Assignment Details


Module Code: CSMAD21
Assignment Report Title: Applied Data Science with Python


Table of Contents

    • Overview of Python Libraries used and Imports
    • Task1
      • Data Understanding
        • Data Overview and Description
      • Data Preprocessing
        • Set Index
        • Data Cleaning
        • Data Summary
          • Data Info
          • Distributions
          • Value Counts
        • Data Reduction and Transformation
          • Data Transformation
          • Data Reduction
      • Statistical Tests
    • Task2
      • Data Understanding
        • Data Overview and Description
      • Performing PCA
      • Clustering
        • K-Means
        • Gaussian Mixed Modelling
        • DBSCAN
        • Compare Methods
    • Task3
      • Data Understanding
        • Data Overview and Description
      • Visualisation
      • Statistical Analysis
        • Properties
        • Statistics
          • Degree Centrality
          • Degree Distribution
          • Clustering Coefficient
          • Betweenness Centrality
          • Assortativity
    • Resources

Overview of Python Libraries used and Imports


  • We make use of multiple Python libraries to perform data understanding, preprocessing, clustering and statistical tests.
  • The libraries used (sources are listed under Resources):
    • pandas
    • NumPy
    • Seaborn
    • Matplotlib and PyPlot
    • Plotly Express
    • SciPy
    • scikit-learn
    • NetworkX
    • warnings
In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px

%matplotlib inline

# Library needed for performing statistical tests
from scipy import stats

# Needed for normalisation
from sklearn import preprocessing
from sklearn import cluster, metrics
from sklearn.decomposition import PCA

# Libraries needed for training the clustering algorithms
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture as GMM
from sklearn.cluster import DBSCAN

# Import the silhouette_score function from sklearn.metrics
from sklearn.metrics import silhouette_score

# Library needed for social network graph
import networkx as nx

import warnings
warnings.filterwarnings('ignore')

Task1

Data Understanding


The metro bike dataset includes anonymized bike travel data from the Los Angeles Metro Bike Share (Source: https://bikeshare.metro.net/about/data/).

Each row corresponds to a single bike trip. The columns are described below:

Column Description
trip_id Locally unique integer that identifies the trip.
duration Length of trip in minutes.
start_time The date/time when the trip began, presented in ISO 8601 format in local time.
end_time The date/time when the trip ended, presented in ISO 8601 format in local time.
start_station The station ID where the trip originated (for station name and more information on each station see the Station Table).
start_lat The latitude of the station where the trip originated.
start_lon The longitude of the station where the trip originated.
end_station The station ID where the trip terminated (for station name and more information on each station see the Station Table).
end_lat The latitude of the station where the trip terminated.
end_lon The longitude of the station where the trip terminated.
bike_id Locally unique integer that identifies the bike.
plan_duration The number of days that the plan the passholder is using entitles them to ride; 0 is used for a single ride plan (Walk-up).
trip_route_category "Round Trip" for trips starting and ending at the same station or "One Way" for all other trips.
passholder_type The name of the passholder's plan.
bike_type The kind of bike used on the trip, including standard pedal-powered bikes, electric assist bikes, or smart bikes.

Data Overview and Description


In [2]:
# Load `metro.csv` into a dataframe `metro_bike` and display the first five rows.
metro_bike = pd.read_csv('metro.csv')
metro_bike.head()
Out[2]:
trip_id duration start_time end_time start_station start_lat start_lon end_station end_lat end_lon bike_id plan_duration trip_route_category passholder_type bike_type
0 124657107 5 7/1/2019 0:04 7/1/2019 0:09 4312 34.066990 -118.290878 4410 34.063351 -118.296799 6168 30 One Way Monthly Pass standard
1 124657587 9 7/1/2019 0:07 7/1/2019 0:16 3066 34.063389 -118.236160 3066 34.063389 -118.236160 17584 30 Round Trip Monthly Pass electric
2 124658068 5 7/1/2019 0:20 7/1/2019 0:25 4410 34.063351 -118.296799 4312 34.066990 -118.290878 18920 30 One Way Monthly Pass electric
3 124659747 20 7/1/2019 0:44 7/1/2019 1:04 3045 34.028511 -118.256668 4275 34.012520 -118.285896 6016 1 One Way Walk-up standard
4 124660227 27 7/1/2019 0:44 7/1/2019 1:11 3035 34.048401 -118.260948 3049 34.056969 -118.253593 5867 30 One Way Monthly Pass standard

Looking at the data above, we can see that start_time and end_time are not in ISO 8601 format, contrary to the column description; they are stored as M/D/YYYY H:MM strings.

In [3]:
# Use `.describe()` to get summary statistics for the numerical columns
metro_bike.describe()
Out[3]:
trip_id duration start_station start_lat start_lon end_station end_lat end_lon plan_duration
count 9.212400e+04 92124.000000 92124.000000 89985.000000 89985.000000 92124.000000 88052.000000 88052.000000 92124.000000
mean 1.274286e+08 33.168588 3484.899690 34.034786 -118.287893 3480.271026 34.034895 -118.286699 60.290977
std 1.524134e+06 129.057841 611.483883 0.058803 0.073501 609.942741 0.058790 0.072628 111.141364
min 1.246571e+08 1.000000 3000.000000 33.710979 -118.495422 3000.000000 33.710979 -118.495422 1.000000
25% 1.261375e+08 6.000000 3029.000000 34.035801 -118.281181 3028.000000 34.037048 -118.280952 1.000000
50% 1.274911e+08 12.000000 3062.000000 34.046810 -118.258537 3062.000000 34.046810 -118.258537 30.000000
75% 1.287379e+08 22.000000 4285.000000 34.051941 -118.248253 4285.000000 34.051941 -118.248253 30.000000
max 1.303877e+08 1440.000000 4453.000000 34.177662 -118.231277 4453.000000 34.177662 -118.231277 999.000000

A brief glance at this description tells us that the column plan_duration has a maximum value of 999. This seems like an outlier and is flagged.

The duration column also seems to contain outliers, as the maximum trip duration is 1440 minutes, which is 24 hours. The source mentions that trips can last up to 24 hours, so the column is left as is.

In [4]:
metro_bike.shape
Out[4]:
(92124, 15)

We can see that the dataset is large enough for any preprocessing that involves removal of rows.

In [5]:
metro_bike.dtypes
Out[5]:
trip_id                  int64
duration                 int64
start_time              object
end_time                object
start_station            int64
start_lat              float64
start_lon              float64
end_station              int64
end_lat                float64
end_lon                float64
bike_id                 object
plan_duration            int64
trip_route_category     object
passholder_type         object
bike_type               object
dtype: object

.dtypes shows that start_time and end_time are stored as object (strings). Date/time data should be converted to a datetime datatype before any time-based analysis.
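The data dictionary describes bike_id as an integer, yet .dtypes reports it as object, which usually means at least one entry could not be read as a number. A minimal sketch for locating such entries is shown here; the toy Series and its non-numeric value are hypothetical and are not taken from metro.csv.

```python
import pandas as pd

# Toy stand-in for the bike_id column; the non-numeric value is hypothetical.
bike_id = pd.Series(['6168', '17584', 'abc123', '5867'])

# Coerce to numbers; anything that cannot be parsed becomes NaN.
as_number = pd.to_numeric(bike_id, errors='coerce')

# Entries that failed to parse explain why the column stays `object`.
non_numeric = bike_id[as_number.isna()]
print(non_numeric.tolist())
```

Running the same check on the real column would show whether the object dtype is caused by a handful of stray values or by a systematic naming scheme.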

In [6]:
# Check for null/missing values to handle in data cleaning
metro_bike.isnull().sum()
Out[6]:
trip_id                   0
duration                  0
start_time                0
end_time                  0
start_station             0
start_lat              2139
start_lon              2139
end_station               0
end_lat                4072
end_lon                4072
bike_id                   0
plan_duration             0
trip_route_category       0
passholder_type           0
bike_type                 0
dtype: int64

We perform data cleaning next, based on the information gleaned from this section.


Data Preprocessing


Set Index

In [7]:
# Since `trip_id` is unique for all the values, we can set it to be the index.
metro_bike.set_index('trip_id', inplace=True)
In [8]:
metro_bike.head()
Out[8]:
duration start_time end_time start_station start_lat start_lon end_station end_lat end_lon bike_id plan_duration trip_route_category passholder_type bike_type
trip_id
124657107 5 7/1/2019 0:04 7/1/2019 0:09 4312 34.066990 -118.290878 4410 34.063351 -118.296799 6168 30 One Way Monthly Pass standard
124657587 9 7/1/2019 0:07 7/1/2019 0:16 3066 34.063389 -118.236160 3066 34.063389 -118.236160 17584 30 Round Trip Monthly Pass electric
124658068 5 7/1/2019 0:20 7/1/2019 0:25 4410 34.063351 -118.296799 4312 34.066990 -118.290878 18920 30 One Way Monthly Pass electric
124659747 20 7/1/2019 0:44 7/1/2019 1:04 3045 34.028511 -118.256668 4275 34.012520 -118.285896 6016 1 One Way Walk-up standard
124660227 27 7/1/2019 0:44 7/1/2019 1:11 3035 34.048401 -118.260948 3049 34.056969 -118.253593 5867 30 One Way Monthly Pass standard

Data Cleaning

We perform data cleaning in this subsection by using .dropna() to remove every row that contains a missing value.

In [9]:
# `inplace` is added as a parameter to modify the current dataframe itself.
metro_bike.dropna(inplace=True)
metro_bike.isnull().sum()
Out[9]:
duration               0
start_time             0
end_time               0
start_station          0
start_lat              0
start_lon              0
end_station            0
end_lat                0
end_lon                0
bike_id                0
plan_duration          0
trip_route_category    0
passholder_type        0
bike_type              0
dtype: int64
In [10]:
metro_bike.shape
Out[10]:
(86760, 14)

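The dataset now has 86760 rows, so 5364 of the original 92124 rows were removed. This is fewer than the 6211 rows suggested by adding the null counts (2139 with missing start coordinates and 4072 with missing end coordinates), which implies that 847 rows were missing both start and end coordinates, assuming latitude and longitude are always missing together.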
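The overlap between missing values can also be measured directly with boolean masks rather than inferred from the row counts. The sketch below uses a small made-up frame with only the two latitude columns; the values are illustrative.

```python
import numpy as np
import pandas as pd

# Toy frame: row 1 lacks start coordinates, row 2 lacks end coordinates,
# row 3 lacks both.
df = pd.DataFrame({
    'start_lat': [34.06, np.nan, 34.05, np.nan],
    'end_lat':   [34.06, 34.02, np.nan, np.nan],
})

missing_start = df['start_lat'].isna()
missing_end = df['end_lat'].isna()

# Rows counted twice when the per-column null counts are added together.
overlap = int((missing_start & missing_end).sum())

# Rows that dropna() removes: a value is missing in at least one column.
dropped = int((missing_start | missing_end).sum())

print(overlap, dropped, len(df.dropna()))
```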

Data Summary


Data Info

In [11]:
# `.info` is used to find a brief overview of the non null rows and data type.
metro_bike.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 86760 entries, 124657107 to 130053088
Data columns (total 14 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   duration             86760 non-null  int64  
 1   start_time           86760 non-null  object 
 2   end_time             86760 non-null  object 
 3   start_station        86760 non-null  int64  
 4   start_lat            86760 non-null  float64
 5   start_lon            86760 non-null  float64
 6   end_station          86760 non-null  int64  
 7   end_lat              86760 non-null  float64
 8   end_lon              86760 non-null  float64
 9   bike_id              86760 non-null  object 
 10  plan_duration        86760 non-null  int64  
 11  trip_route_category  86760 non-null  object 
 12  passholder_type      86760 non-null  object 
 13  bike_type            86760 non-null  object 
dtypes: float64(4), int64(4), object(6)
memory usage: 9.9+ MB

Distributions

In [12]:
metro_bike.duration.hist(bins=50, range=(1,45))

plt.xlabel("Minutes")
plt.ylabel("Frequency")
Out[12]:
Text(0, 0.5, 'Frequency')

The histogram gives us a distribution of the duration of a single bike journey, ranging from 1 minute to 45 minutes.

In [13]:
# Use .mean() and .median() for mean and median values
metro_bike.duration.mean(), metro_bike.duration.median()
Out[13]:
(26.99640387275242, 11.0)
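The mean (about 27 minutes) is far above the median (11 minutes), which points to a right-skewed distribution in which a few very long trips pull the mean upwards. The sketch below reproduces the effect on a handful of illustrative durations; the numbers are made up and only the pattern matters.

```python
import pandas as pd

# Illustrative durations in minutes: mostly short trips plus two very long ones.
duration = pd.Series([5, 6, 8, 9, 11, 12, 15, 20, 600, 1440])

mean_minutes = duration.mean()
median_minutes = duration.median()

# A positive sample skewness indicates a long right tail.
skewness = duration.skew()

print(mean_minutes, median_minutes, skewness > 0)
```

For data shaped like this, the median is usually the more representative summary of a typical trip.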
In [14]:
metro_bike.hist(figsize=(10, 10))
Out[14]:
array([[<AxesSubplot:title={'center':'duration'}>,
        <AxesSubplot:title={'center':'start_station'}>,
        <AxesSubplot:title={'center':'start_lat'}>],
       [<AxesSubplot:title={'center':'start_lon'}>,
        <AxesSubplot:title={'center':'end_station'}>,
        <AxesSubplot:title={'center':'end_lat'}>],
       [<AxesSubplot:title={'center':'end_lon'}>,
        <AxesSubplot:title={'center':'plan_duration'}>, <AxesSubplot:>]],
      dtype=object)

Multiple columns of the dataset (numerical values) are displayed in the histogram above.

We create a new dataframe where we filter the values to display durations less than 6 hours and plot the new histogram.

In [15]:
metro_bike_new = metro_bike[metro_bike.duration < 360]
metro_bike_new.shape
Out[15]:
(85974, 14)
In [16]:
metro_bike_new.duration.hist(bins=50, range=(1,90))

plt.xlabel("Minutes")
plt.ylabel("Frequency")
Out[16]:
Text(0, 0.5, 'Frequency')

Value Counts

.value_counts() is the pandas method that gives the count of each unique value in the selected column.

In [17]:
metro_bike['plan_duration'].value_counts()
Out[17]:
30     55907
1      21451
365     9375
999       27
Name: plan_duration, dtype: int64

plan_duration records the number of days for which a pass entitles the user to ride. We plot its value counts below using a Seaborn countplot.

In [18]:
p_d = sns.countplot(x='plan_duration', data=metro_bike)
for p in p_d.patches:
    p_d.text(p.get_x() + p.get_width()/2, p.get_height(), p.get_height(), ha='center')
p_d.set(xlabel='Plan Duration', ylabel='Count')
p_d.set_title('Count of Plan Duration')
Out[18]:
Text(0.5, 1.0, 'Count of Plan Duration')

The counts of trip_route_category show the number of one-way and round-trip journeys undertaken by riders, which we then plot.

In [19]:
metro_bike['trip_route_category'].value_counts()
Out[19]:
One Way       72029
Round Trip    14731
Name: trip_route_category, dtype: int64
In [20]:
sns.set(rc={'figure.figsize':(11,9)})
trip_cat = sns.countplot(x='trip_route_category', data=metro_bike)
for p in trip_cat.patches:
    trip_cat.text(p.get_x() + p.get_width()/2, p.get_height(), p.get_height(), ha='center')
trip_cat.set(xlabel='Trip Route', ylabel='Count')
trip_cat.set_title('Count of various Trip Categories')
Out[20]:
Text(0.5, 1.0, 'Count of various Trip Categories')

We look at the various passes available for purchase to users.

In [21]:
metro_bike['passholder_type'].value_counts()
Out[21]:
Monthly Pass    55904
Walk-up         21258
Annual Pass      5966
One Day Pass     3599
Testing            27
Flex Pass           6
Name: passholder_type, dtype: int64

An interesting observation is that the Testing pass has the same count (27) as the value 999 in the plan_duration column. Our assumption is that when a user wants to test out the service, LA Metro Bike issues a pass whose plan is valid for 999 days, so that these trips do not disrupt the rest of the data.
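Matching counts alone do not prove that the two columns refer to the same rows. A cross-tabulation would confirm it; the sketch below shows the idea on a few illustrative rows (the toy frame is an assumption, only the column names come from the dataset).

```python
import pandas as pd

# Toy rows mirroring the two columns; the values are illustrative.
trips = pd.DataFrame({
    'passholder_type': ['Monthly Pass', 'Walk-up', 'Testing', 'Testing', 'Annual Pass'],
    'plan_duration':   [30, 1, 999, 999, 365],
})

# Cross-tabulate to see which passes carry which plan length.
table = pd.crosstab(trips['passholder_type'], trips['plan_duration'])

# Passholder types that appear with a plan_duration of 999.
types_with_999 = table.index[table[999] > 0].tolist()
print(types_with_999)
```

If the same call on metro_bike lists only Testing, the 999 value can be treated as a marker for test rides rather than as an outlier.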

In [22]:
sns.set(rc={'figure.figsize':(11,7)})
pass_type = sns.countplot(x='passholder_type', data=metro_bike)
for p in pass_type.patches:
    pass_type.text(p.get_x() + p.get_width()/2, p.get_height(), p.get_height(), ha='center')
pass_type.set(xlabel='Types of Passes', ylabel='Count')
pass_type.set_title('Count of various Passes Available')
Out[22]:
Text(0.5, 1.0, 'Count of various Passes Available')

The counts of Flex Pass and Testing passes are so low that their bars are barely visible on the plot.

In [23]:
metro_bike['bike_type'].value_counts()
Out[23]:
electric    45818
standard    28966
smart       11976
Name: bike_type, dtype: int64

Exploring count of various bike types, we see electric bikes are chosen by most users for their travel needs and we plot the same.

In [24]:
sns.set(rc={'figure.figsize':(11,8)})
bike_type = sns.countplot(x='bike_type', data=metro_bike)
for p in bike_type.patches:
    bike_type.text(p.get_x() + p.get_width()/2, p.get_height(), p.get_height(), ha='center')
bike_type.set(xlabel='Types of Bikes', ylabel='Count')
bike_type.set_title('Count of Various Bikes Available')
Out[24]:
Text(0.5, 1.0, 'Count of Various Bikes Available')

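A distplot combines a histogram of a single column with a kernel density estimate (KDE), a smooth curve that approximates the probability density function of the data. Note that sns.distplot is deprecated in recent Seaborn releases in favour of histplot and displot.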

In [25]:
pdf = sns.distplot(metro_bike['plan_duration'])
pdf.set(xlabel='Plan Duration', ylabel='Density')
pdf.set_title('Probability Density Function of Plan Duration')
plt.show()

Looking at the histogram above, the overlaid curve is the estimated probability density function of plan_duration.


Data Reduction and Transformation


Data Transformation

pd.to_datetime is used on the start_time and end_time columns to convert them from strings to the datetime64 datatype, which pandas displays in ISO 8601 style. Time-based operations such as extracting the hour or the weekday require this datatype.

In [26]:
metro_bike['start_time'] = pd.to_datetime(metro_bike['start_time'])
metro_bike['end_time'] = pd.to_datetime(metro_bike['end_time'])

metro_bike.head()
Out[26]:
duration start_time end_time start_station start_lat start_lon end_station end_lat end_lon bike_id plan_duration trip_route_category passholder_type bike_type
trip_id
124657107 5 2019-07-01 00:04:00 2019-07-01 00:09:00 4312 34.066990 -118.290878 4410 34.063351 -118.296799 6168 30 One Way Monthly Pass standard
124657587 9 2019-07-01 00:07:00 2019-07-01 00:16:00 3066 34.063389 -118.236160 3066 34.063389 -118.236160 17584 30 Round Trip Monthly Pass electric
124658068 5 2019-07-01 00:20:00 2019-07-01 00:25:00 4410 34.063351 -118.296799 4312 34.066990 -118.290878 18920 30 One Way Monthly Pass electric
124659747 20 2019-07-01 00:44:00 2019-07-01 01:04:00 3045 34.028511 -118.256668 4275 34.012520 -118.285896 6016 1 One Way Walk-up standard
124660227 27 2019-07-01 00:44:00 2019-07-01 01:11:00 3035 34.048401 -118.260948 3049 34.056969 -118.253593 5867 30 One Way Monthly Pass standard
In [27]:
metro_bike.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 86760 entries, 124657107 to 130053088
Data columns (total 14 columns):
 #   Column               Non-Null Count  Dtype         
---  ------               --------------  -----         
 0   duration             86760 non-null  int64         
 1   start_time           86760 non-null  datetime64[ns]
 2   end_time             86760 non-null  datetime64[ns]
 3   start_station        86760 non-null  int64         
 4   start_lat            86760 non-null  float64       
 5   start_lon            86760 non-null  float64       
 6   end_station          86760 non-null  int64         
 7   end_lat              86760 non-null  float64       
 8   end_lon              86760 non-null  float64       
 9   bike_id              86760 non-null  object        
 10  plan_duration        86760 non-null  int64         
 11  trip_route_category  86760 non-null  object        
 12  passholder_type      86760 non-null  object        
 13  bike_type            86760 non-null  object        
dtypes: datetime64[ns](2), float64(4), int64(4), object(4)
memory usage: 11.9+ MB
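The conversion above lets pandas infer the date format. Since a string such as 7/1/2019 is ambiguous between 1 July and 7 January, an explicit format can make the intent clear. The sketch below assumes month-first ordering, which is consistent with the July to September range of the data, and uses two timestamps from the first row of the dataset.

```python
import pandas as pd

# Raw strings in the form they appear in metro.csv.
raw = pd.Series(['7/1/2019 0:04', '7/1/2019 0:09'])

# Assumption: month comes first, then day, then a 24-hour time without seconds.
parsed = pd.to_datetime(raw, format='%m/%d/%Y %H:%M')

# Render as ISO 8601 strings for comparison with the column description.
iso = parsed.dt.strftime('%Y-%m-%dT%H:%M:%S')
print(iso.tolist())
```

An explicit format also makes parsing fail loudly if a row does not follow the expected pattern, instead of silently guessing.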

Data Reduction

To perform aggregations on duration, we must split the start_time column into separate columns.

In [28]:
metro_bike['start_day'] = metro_bike['start_time'].dt.dayofweek

days_of_week = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

metro_bike['start_month'] = metro_bike['start_time'].dt.month_name()
metro_bike['start_hour'] = metro_bike['start_time'].dt.hour
metro_bike['start_date'] = metro_bike['start_time'].dt.date
metro_bike['starting_time'] = metro_bike['start_time'].dt.time
metro_bike['end_hour'] = metro_bike['end_time'].dt.hour

metro_bike.head()
Out[28]:
duration start_time end_time start_station start_lat start_lon end_station end_lat end_lon bike_id plan_duration trip_route_category passholder_type bike_type start_day start_month start_hour start_date starting_time end_hour
trip_id
124657107 5 2019-07-01 00:04:00 2019-07-01 00:09:00 4312 34.066990 -118.290878 4410 34.063351 -118.296799 6168 30 One Way Monthly Pass standard 0 July 0 2019-07-01 00:04:00 0
124657587 9 2019-07-01 00:07:00 2019-07-01 00:16:00 3066 34.063389 -118.236160 3066 34.063389 -118.236160 17584 30 Round Trip Monthly Pass electric 0 July 0 2019-07-01 00:07:00 0
124658068 5 2019-07-01 00:20:00 2019-07-01 00:25:00 4410 34.063351 -118.296799 4312 34.066990 -118.290878 18920 30 One Way Monthly Pass electric 0 July 0 2019-07-01 00:20:00 0
124659747 20 2019-07-01 00:44:00 2019-07-01 01:04:00 3045 34.028511 -118.256668 4275 34.012520 -118.285896 6016 1 One Way Walk-up standard 0 July 0 2019-07-01 00:44:00 1
124660227 27 2019-07-01 00:44:00 2019-07-01 01:11:00 3035 34.048401 -118.260948 3049 34.056969 -118.253593 5867 30 One Way Monthly Pass standard 0 July 0 2019-07-01 00:44:00 1
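The start_day column holds integers, where pandas uses 0 for Monday through 6 for Sunday. The sketch below shows how these codes relate to the days_of_week list defined above, using three illustrative timestamps (only the first is taken from the dataset).

```python
import pandas as pd

days_of_week = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
                'Friday', 'Saturday', 'Sunday']

# The first timestamp comes from the dataset; the other two are illustrative.
start_time = pd.to_datetime(pd.Series([
    '2019-07-01 00:04:00', '2019-07-06 10:30:00', '2019-07-07 18:15:00']))

# Integer weekday codes: Monday=0 ... Sunday=6.
start_day = start_time.dt.dayofweek

# Translate the codes into names by indexing into the list.
start_day_name = start_day.map(lambda d: days_of_week[d])
print(start_day.tolist(), start_day_name.tolist())
```

Plot axes labelled with the names rather than the codes are easier to read; pandas also offers .dt.day_name() for the same purpose.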

We start by plotting the daily count of bikes rented. Most are rented on Tuesday (day 1), with Monday (day 0) coming next.

In [29]:
metro_bike.groupby('start_day').count()['bike_type'].plot(figsize=(12,8))
plt.xlabel("Days of the Week")
plt.ylabel("Frequency of the Bikes rented")
plt.title("Daily Bike Frequency")
plt.show()

A countplot of the same data, with the count annotated on each bar, confirms the above observation.

In [30]:
count_day = sns.countplot(x='start_day', data=metro_bike)
for p in count_day.patches:
    count_day.text(p.get_x() + p.get_width()/2, p.get_height(), p.get_height(), ha='center')
count_day.set(xlabel='Day of the Week', ylabel='Count')
count_day.set_title('Count of Bikes based on Days')
Out[30]:
Text(0.5, 1.0, 'Count of Bikes based on Days')
In [31]:
# Plot bike ride for each hour of the day for the entire week
plt.figure(figsize=(24,8))

# Plot the countplot with a legend to the side
sns.countplot(x='start_day', hue='start_hour', data=metro_bike, palette='terrain')
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5), ncol=2)
Out[31]:
<matplotlib.legend.Legend at 0x7f98569b30d0>
In [32]:
# Create a sunburst plot using the 'start_hour' and 'start_station' columns
fig = px.sunburst(metro_bike, path=['start_hour', 'start_station'], color='start_hour')

fig.update_layout(title='Start Hour vs Start Station')
fig.show()
In [33]:
# Create a sunburst plot using the 'end_hour' and 'end_station' columns
fig = px.sunburst(metro_bike, path=['end_hour', 'end_station'], color='end_station')

fig.update_layout(title='End hour vs End Station')
fig.show()

The sunburst plots show that start_station and end_station activity is concentrated around stations 3030 and 3014, with both peaking in the same hours, 16 and 17.

We find most bikes are rented on Tuesday at hour 17.
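The busiest combination of day and hour can also be read off numerically instead of from the plots. A minimal sketch on illustrative rows, using the same column names as the dataframe:

```python
import pandas as pd

# Illustrative trips; day 1 (Tuesday) at hour 17 occurs twice.
trips = pd.DataFrame({
    'start_day':  [1, 1, 1, 0, 0, 5],
    'start_hour': [17, 17, 8, 17, 9, 12],
})

# Count trips for every (day, hour) combination.
counts = trips.groupby(['start_day', 'start_hour']).size()

# idxmax returns the (day, hour) index of the largest count.
busiest_day, busiest_hour = counts.idxmax()
print(busiest_day, busiest_hour, counts.max())
```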

We can also focus on finding how long the trip duration lasts based on the starting hour of each day of the week. This is done by using the extracted columns start_hour and start_day and plotting them against duration.

In [34]:
# Group the data by hour and day of the week and compute the mean duration
mean_duration = metro_bike.groupby(['start_hour', 'start_day'])['duration'].mean().reset_index()

# Loop over the days of the week
for day in range(7):
    # Select the data for the current day
    data = mean_duration[mean_duration['start_day'] == day]
    
    # Create a barplot
    duration_plot = sns.barplot(x='start_hour', y='duration', data=data, palette='RdYlBu')
    
    duration_plot.set(xlabel='Start Hour', ylabel='Duration')
    
    # Set the title of the plot
    duration_plot.set_title(days_of_week[day])
    
    # Show the plot
    plt.show()

The above plots show that the pattern of mean duration by starting hour varies from day to day, although it is fairly constant for the first few days of the week. Sunday has the most even mean duration across hours.
On weekdays, users generally take the bikes for longer durations.

We can split the plots into two: one plot of the mean duration against start_hour and another of the mean duration against the day of the week.

In [35]:
mean_duration1 = metro_bike.groupby(['start_hour'])['duration'].mean().reset_index()

dur_hour = sns.lineplot(x='start_hour', y='duration', data=mean_duration1)
for line in range(0, mean_duration1.shape[0]):
    dur_hour.text(mean_duration1.start_hour[line], mean_duration1.duration[line], round(mean_duration1.duration[line], 1), 
            horizontalalignment='left', size='medium', color='grey', weight='semibold')
dur_hour.set(xlabel='Start Hour', ylabel='Mean Duration')
dur_hour.set_title('Mean Duration vs Starting Hour')
Out[35]:
Text(0.5, 1.0, 'Mean Duration vs Starting Hour')

The lineplot gives a clearer idea of the mean duration of bike trips based on the hour the trip began.
The mean duration is almost 90 minutes when start_hour is 3, that is, the third hour after midnight. The shortest mean duration occurs around hour 7, when the day typically begins.

In [36]:
mean_duration2 = metro_bike.groupby(['start_day'])['duration'].mean().reset_index()

dur_day = sns.lineplot(x='start_day', y='duration', data=mean_duration2)
for line in range(0, mean_duration2.shape[0]):
    dur_day.text(mean_duration2.start_day[line], mean_duration2.duration[line], round(mean_duration2.duration[line], 1), 
            horizontalalignment='left', size='medium', color='green', weight='semibold')
dur_day.set(xlabel='Day of the Week', ylabel='Mean Duration')
dur_day.set_title('Mean Duration vs Day of the Week')
Out[36]:
Text(0.5, 1.0, 'Mean Duration vs Day of the Week')

The above plot tells us that day 6, Sunday, has the longest mean duration. This is consistent with riders having more free time for leisure trips at the weekend. Friday, on the other hand, has the shortest mean duration.

In [37]:
metro_bike['start_month'].value_counts()
Out[37]:
August       30876
September    28716
July         27168
Name: start_month, dtype: int64

August had the most bikes rented and the plot confirms it.

In [38]:
sns.set(rc={'figure.figsize':(11,8)})
month_count = sns.countplot(x='start_month', data=metro_bike)
for p in month_count.patches:
    month_count.text(p.get_x() + p.get_width()/2, p.get_height(), p.get_height(), ha='center')
month_count.set(xlabel='Month', ylabel='Count')
month_count.set_title('Count of Bikes rented based on Month')
Out[38]:
Text(0.5, 1.0, 'Count of Bikes rented based on Month')

A pie chart of the same counts shows each month's share of all trips as a percentage.

In [39]:
# Take labels and values from the same Series so they cannot fall out of step
month_counts = metro_bike['start_month'].value_counts()
plt.pie(x=month_counts, autopct='%1.1f%%', labels=month_counts.index)
plt.title("Percentage of Trips per Month")
plt.show()
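The percentages on the pie chart can be checked by hand from the monthly counts reported by value_counts() above. A short sketch of that arithmetic:

```python
import pandas as pd

# Trip counts per month, copied from the value_counts() output above.
month_counts = pd.Series({'August': 30876, 'September': 28716, 'July': 27168})

# Each month's share of all trips, in percent, rounded to one decimal place.
share = (month_counts / month_counts.sum() * 100).round(1)
print(share.to_dict())
```

The three months each account for roughly a third of the trips, so the differences between them are modest.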

Count of the bikes rented plotted against the start_hour gives the result that the 17th hour of the day is the busiest for bike rentals, correlating with end of the workday.

In [40]:
hour_count = sns.countplot(x='start_hour', data=metro_bike)
for p in hour_count.patches:
    hour_count.text(p.get_x() + p.get_width()/2, p.get_height(), p.get_height(), ha='center')
hour_count.set(xlabel='Hour', ylabel='Count')
hour_count.set_title('Count of Bikes rented based on Hour of the Day')
Out[40]:
Text(0.5, 1.0, 'Count of Bikes rented based on Hour of the Day')
In [41]:
# Create a cross-tabulation of the 'passholder_type' and 'bike_type' columns
pass_bike = pd.crosstab(metro_bike['passholder_type'], metro_bike['bike_type'])

# Plot the cross-tabulation as a bar plot
ax = pass_bike.plot(kind='bar', figsize=(15, 15))

for p in ax.patches:
    ax.text(p.get_x() + p.get_width()/2, p.get_height(), p.get_height(), ha='center')

plt.xlabel('Passholder Type')
plt.ylabel('Count')
plt.title('Passholder Type over various Bike Types')
Out[41]:
Text(0.5, 1.0, 'Passholder Type over various Bike Types')

The various passholder_types are plotted against bike_types to get an idea of the kind of bikes rented by the users.

The mean duration of each bike type in the bike_type column is plotted next. Smart bikes have the longest mean duration; the mean for each of the other types is almost half of that.

In [42]:
mean_duration3 = metro_bike.groupby('bike_type')['duration'].mean().reset_index()

dur_bike = sns.barplot(x='bike_type', y='duration', data=mean_duration3, palette='pastel')
for p in dur_bike.patches:
    dur_bike.text(p.get_x() + p.get_width()/2, p.get_height(), round(p.get_height(), 1), ha='center')
dur_bike.set(xlabel='Bike Type', ylabel='Duration')
dur_bike.set_title('Mean Duration of Various Bike Types')
Out[42]:
Text(0.5, 1.0, 'Mean Duration of Various Bike Types')

A violinplot and swarmplot were also attempted for the above aggregation, but a barplot is better suited to comparing the means of different groups and was preferred. A violinplot is useful for showing the distribution of the data, and a swarmplot is a form of scatterplot that shows individual data points.


Statistical Tests


We perform an independent two-sample t-test on trip duration for every pair of passholder_type values. The p-value for each pair indicates whether the mean durations of the two passholder types differ significantly.

In [43]:
metro_by_passholder = metro_bike.groupby("passholder_type")

duration_by_passholder = metro_by_passholder["duration"]

mean_duration_by_passholder = duration_by_passholder.mean()

# Create one subplot per reference passholder type (the last type has no remaining pairs)
n_plots = len(mean_duration_by_passholder.index) - 1
fig, axs = plt.subplots(nrows=n_plots, ncols=1, figsize=(20, 6*n_plots))

# Loop through each pair of passholder types and plot the p-value of the t-test
for i, passholder_type1 in enumerate(mean_duration_by_passholder.index[:-1]):
  for passholder_type2 in mean_duration_by_passholder.index[i+1:]:
    t_stat, p_value = stats.ttest_ind(duration_by_passholder.get_group(passholder_type1), 
                                      duration_by_passholder.get_group(passholder_type2))
    print(f'{passholder_type1} vs. {passholder_type2}: p-value = {p_value}')
    axs[i].bar(f"{passholder_type1} vs. {passholder_type2}", p_value)
  # Annotate each bar once, after all bars of this subplot have been drawn
  for p in axs[i].patches:
    axs[i].text(p.get_x() + p.get_width()/2, p.get_height(), f"{p.get_height():.3g}", ha='center')
  axs[i].set_title(f"p-value for {passholder_type1}")
  axs[i].set_xlabel("Passholder types")
  axs[i].set_ylabel("p-value")

plt.show()
Annual Pass vs. Flex Pass: p-value = 0.8158212864252368
Annual Pass vs. Monthly Pass: p-value = 0.0031987234897233076
Annual Pass vs. One Day Pass: p-value = 8.288939500446688e-155
Annual Pass vs. Testing: p-value = 0.2395399041543238
Annual Pass vs. Walk-up: p-value = 7.970594234589781e-87
Flex Pass vs. Monthly Pass: p-value = 0.6891909679056905
Flex Pass vs. One Day Pass: p-value = 0.30731294748527543
Flex Pass vs. Testing: p-value = 0.4103221827358773
Flex Pass vs. Walk-up: p-value = 0.46382909975284303
Monthly Pass vs. One Day Pass: p-value = 0.0
Monthly Pass vs. Testing: p-value = 0.19523283057602708
Monthly Pass vs. Walk-up: p-value = 0.0
One Day Pass vs. Testing: p-value = 0.11446433041037984
One Day Pass vs. Walk-up: p-value = 3.814834116252384e-19
Testing vs. Walk-up: p-value = 0.3668698793884352

The above code runs a t-test on duration for every pair of passholder types. For each pair it computes the t-statistic and the p-value, which are used to decide whether the mean durations differ significantly. The null hypothesis is that the two passholder types have the same mean duration; if the p-value is less than 0.05, the null hypothesis is rejected and the difference is considered significant.

Looking at the above graphs and p-values, the Annual Pass differs significantly from the Monthly Pass, One Day Pass and Walk-up, so the null hypothesis is rejected for those pairs. It is not rejected for Annual Pass vs. Flex Pass or Testing, which have only 6 and 27 trips respectively.
We can also reject the null hypothesis for One Day Pass vs. Walk-up, as the p-value is less than 0.05.
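The tests above use the default settings of stats.ttest_ind, which assume equal variances, and each of the 15 pairs is judged at 0.05. As a possible refinement that is not part of the analysis above, the sketch below runs Welch's t-test (equal_var=False) and applies a Bonferroni-adjusted threshold. The two groups are synthetic and hypothetical, chosen only to have unequal sizes and spreads.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic trip durations (minutes) for two hypothetical passholder groups
# with different sizes and spreads.
group_a = rng.normal(loc=15, scale=5, size=500)
group_b = rng.normal(loc=60, scale=30, size=50)

# Welch's t-test: does not assume the two groups share a variance.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)

# Bonferroni adjustment: with 15 pairwise comparisons, each one is
# tested at 0.05 / 15 to keep the overall false-positive rate near 0.05.
n_comparisons = 15
alpha = 0.05 / n_comparisons
significant = p_value < alpha

print(p_value, alpha, significant)
```

With a stricter threshold, borderline results are less likely to be reported as significant purely because many pairs were tested.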

Task2

Data Understanding


The seeds dataset has been sourced from the UCI Machine Learning repository. (Source: https://archive.ics.uci.edu/ml/datasets/seeds)

The dataset contains measurements of seeds (wheat kernels from three varieties, according to the UCI source), where each row holds the measurements of a single seed.

Column Description
area A, the area of the seed.
perimeter P, the length of the perimeter of the seed.
compactness A measure of the area of the seed relative to the perimeter, 4πA/P².
length The length of the seed.
width The width of the seed.
asymmetry A measure of the asymmetry of the seed.
groove_length The length of the groove in the seed.
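The compactness formula can be checked against the data. The sketch below recomputes it from the area and perimeter of the first two rows of seeds.csv, as displayed in the data overview; the function name is our own.

```python
import math

def compactness(area, perimeter):
    """Return 4*pi*A / P**2, which equals 1 for a perfect circle."""
    return 4 * math.pi * area / perimeter ** 2

# Area and perimeter of the first two rows of seeds.csv.
first = round(compactness(15.26, 14.84), 3)
second = round(compactness(14.88, 14.57), 3)
print(first, second)
```

Both results agree with the compactness values stored in the dataset (0.871 and 0.881), which confirms that the column is derived from area and perimeter rather than measured independently.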

Data Overview and Description


In [44]:
# Load the dataset `seeds.csv` into a dataframe `seeds`
seeds = pd.read_csv('seeds.csv')
seeds.head()
Out[44]:
area perimeter compactness length width asymmetry groove_length
0 15.26 14.84 0.871 5.763 3.312 2.221 5.220
1 14.88 14.57 0.881 5.554 3.333 1.018 4.956
2 14.29 14.09 0.905 5.291 3.337 2.699 4.825
3 13.84 13.94 0.895 5.324 3.379 2.259 4.805
4 16.14 14.99 0.903 5.658 3.562 1.355 5.175

The next few cells inspect the datatypes, summary statistics, and null/missing and duplicated values, and create visualisations to understand the data.

In [45]:
seeds.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 210 entries, 0 to 209
Data columns (total 7 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   area           210 non-null    float64
 1   perimeter      210 non-null    float64
 2   compactness    210 non-null    float64
 3   length         210 non-null    float64
 4   width          210 non-null    float64
 5   asymmetry      210 non-null    float64
 6   groove_length  210 non-null    float64
dtypes: float64(7)
memory usage: 11.6 KB
In [46]:
seeds.describe()
Out[46]:
area perimeter compactness length width asymmetry groove_length
count 210.000000 210.000000 210.000000 210.000000 210.000000 210.000000 210.000000
mean 14.847524 14.559286 0.871000 5.628533 3.258605 3.700200 5.408071
std 2.909699 1.305959 0.023594 0.443063 0.377714 1.503559 0.491480
min 10.590000 12.410000 0.808000 4.899000 2.630000 0.765000 4.519000
25% 12.270000 13.450000 0.857250 5.262250 2.944000 2.561500 5.045000
50% 14.355000 14.320000 0.873500 5.523500 3.237000 3.599000 5.223000
75% 17.305000 15.715000 0.887750 5.979750 3.561750 4.768750 5.877000
max 21.180000 17.250000 0.918000 6.675000 4.033000 8.456000 6.550000
In [47]:
seeds.isna().sum()
Out[47]:
area             0
perimeter        0
compactness      0
length           0
width            0
asymmetry        0
groove_length    0
dtype: int64
In [48]:
seeds.duplicated().sum()
Out[48]:
0

A pairplot draws a matrix of scatterplots showing the relationship between every pair of variables in the dataset.

In [49]:
pair_plot = sns.pairplot(seeds, hue='perimeter')
pair_plot.fig.suptitle('Plot of Seeds Data', y=1.03)
Out[49]:
Text(0.5, 1.03, 'Plot of Seeds Data')

A correlation matrix is utilised when plotting a heatmap to help understand the strength of the relationship between two variables.

In [50]:
corr_matrix = seeds.corr()
sns.set(rc={'figure.figsize':(12,10)})
sns.heatmap(corr_matrix, vmax=1, square=True, cmap='RdYlGn')
Out[50]:
<AxesSubplot:>
In [51]:
seeds.shape
Out[51]:
(210, 7)

Performing PCA¶


Clustering algorithms tend to perform better on data with fewer dimensions, and two dimensions are easy to visualise, so we apply PCA to reduce the seven features to two principal components before clustering.

In [52]:
# Perform PCA to find two principal components before clustering for improving performance of the algorithms
pca = PCA(n_components=2)
seeds_pca = pca.fit_transform(seeds)
In [53]:
seeds_pca_df = pd.DataFrame(seeds_pca, columns=['col1', 'col2'])
In [54]:
pca_plot = sns.pairplot(seeds_pca_df);
pca_plot.fig.suptitle('Plot of Seeds PCA Data', y=1.03)
Out[54]:
Text(0.5, 1.03, 'Plot of Seeds PCA Data')
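
One way to judge how much information two components retain is the explained variance ratio. The sketch below uses synthetic data of the same shape as the seeds table. Standardising the features first with `StandardScaler` is an assumption added here for illustration; the cells above apply PCA to the raw values.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a (210, 7) feature table with very different scales
X = rng.normal(size=(210, 7)) * np.array([3.0, 1.3, 0.02, 0.4, 0.4, 1.5, 0.5])

# Standardise so that large-scale columns do not dominate the components
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

# Fraction of the total variance captured by each retained component
explained = pca.explained_variance_ratio_
print(X_2d.shape, explained, explained.sum())
```

If the two ratios sum to a value close to 1, little is lost by clustering in the reduced space.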

Clustering¶


K-Means¶

K-means clustering is a method of unsupervised learning that is used to partition a dataset into a specified number of clusters.

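To perform k-means clustering, the user specifies the number of clusters k, and the algorithm starts from an initial set of k centroids (scikit-learn's KMeans chooses these automatically by default). It then alternates between assigning each data point to the cluster with the closest centroid and recomputing each centroid as the mean of its assigned points, until the assignments stop changing.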
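
The two alternating steps can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not scikit-learn's implementation; the function name `lloyd_kmeans` and the choice of initial centroids are ours.

```python
import numpy as np


def lloyd_kmeans(X, init_centroids, n_iter=100):
    """Plain k-means: alternate the assignment step and the update step."""
    centroids = np.asarray(init_centroids, dtype=float).copy()
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for every point
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points
        new_centroids = centroids.copy()
        for j in range(len(centroids)):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                new_centroids[j] = members.mean(axis=0)
        # Stop once the centroids no longer move
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


rng = np.random.default_rng(1)
# Two well-separated synthetic blobs of 50 points each
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(10, 0.5, (50, 2))])

# Start from one point of each blob (an arbitrary choice for this sketch)
labels, centroids = lloyd_kmeans(X, init_centroids=X[[0, 50]])
print(np.bincount(labels), centroids.round(2))
```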

In [55]:
# Create an empty list to store the silhouette scores
silhouette_scores = []

# Loop through the number of clusters to use for k-means
for n in range(2, 8):
    # Perform k-means clustering on the two PCA columns only, so that label
    # columns added in earlier iterations are not treated as features
    kmeans = KMeans(n_clusters=n)
    seeds_pca_df[f'{n}_clusters'] = kmeans.fit_predict(seeds_pca_df[['col1', 'col2']])
    
    # Calculate the silhouette score for the current number of clusters
    score = metrics.silhouette_score(seeds_pca_df[['col1', 'col2']], seeds_pca_df[f'{n}_clusters'])
    
    # Append the silhouette score to the list
    silhouette_scores.append(score)

# Create a figure with 3 rows and 2 columns of subplots
fig, ax = plt.subplots(3, 2, sharex=True, sharey=True, figsize=(10, 8))

# Loop through each row and column in the figure
for row in range(3):
    for col in range(2):
        # Set the color palette for the current subplot
        n_clusters = 3*col + row + 2
        color_map = plt.get_cmap('inferno', n_clusters)
        
        # Create a scatterplot on the current subplot using the 'col1' and 'col2' columns as the x and y values, respectively
        ax[row, col].scatter(seeds_pca_df['col1'], seeds_pca_df['col2'], 
                             c=seeds_pca_df[f'{n_clusters}_clusters'], cmap=color_map)
        
        # Set the title of the current subplot
        ax[row, col].set_title(f"{n_clusters} Clusters\nSilhouette Score = {silhouette_scores[n_clusters-2]:.3f}")
        
        # Add a grid to the current subplot
        ax[row, col].grid(True)

# Add a title to the figure
plt.suptitle("KMeans Clustering")

# Tighten the layout of the figure
fig.tight_layout()

The idea behind the elbow method is that, as the number of clusters increases, the within-cluster sum of squares (WCSS) initially decreases rapidly as the clusters become more compact. After a certain point it decreases more slowly, as the benefit of adding further clusters diminishes. The elbow point is the value of k at which this transition occurs. For our dataset, the elbow is found at k = 3.

In [56]:
# Arbitrarily selecting a range of values for K to perform the elbow method
K = range(1,11)
sum_of_squared_distances = []
# Using Scikit Learn’s KMeans Algorithm to find sum of squared distances
for k in K:
    # Fit on the two PCA columns only; the cluster-label columns added earlier are not features
    model = KMeans(n_clusters=k).fit(seeds_pca_df[['col1', 'col2']])
    sum_of_squared_distances.append(model.inertia_)
plt.plot(K, sum_of_squared_distances)
plt.xlabel('K values')
plt.ylabel('Sum of Squared Distances')
plt.title('Elbow Method')
plt.show()
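
The quantity plotted by the elbow method, exposed by scikit-learn as `inertia_`, is the sum of squared distances from each point to its cluster centre. The sketch below recomputes it by hand on synthetic three-blob data and lists how much each extra cluster reduces it. The data and the helper `wcss` are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Three synthetic blobs of 60 points each along the diagonal
X = np.vstack([
    rng.normal(loc=center, scale=0.4, size=(60, 2))
    for center in (0.0, 4.0, 8.0)
])


def wcss(X, labels, centers):
    """Sum of squared distances from each point to its cluster centre."""
    return float(((X - centers[labels]) ** 2).sum())


inertias, manual = [], []
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)
    manual.append(wcss(X, model.labels_, model.cluster_centers_))

# Reduction in WCSS gained by each extra cluster; a sharp fall-off marks the elbow
drops = -np.diff(inertias)
print(np.round(inertias, 1))
print(np.round(drops, 1))
```

On data like this the reduction collapses after k = 3, which is the behaviour the elbow plot is meant to reveal.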

Gaussian Mixture Modelling¶

GMM is a probabilistic model that is used to represent the distribution of a dataset as a mixture of multiple Gaussian distributions.

To fit a GMM, the user specifies the number of components in the mixture, and the model is initialised with a set of parameters (means, covariances and mixing weights). The parameters are then refined by an iterative optimisation procedure, expectation-maximisation (EM), until convergence, at which point the model is considered trained.
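
What distinguishes a GMM from k-means is that each point receives a probability of belonging to every component. The sketch below fits scikit-learn's `GaussianMixture` to synthetic two-blob data and shows both the hard labels and the soft assignments; the data and the `random_state` value are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two synthetic blobs of 100 points each
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(100, 2)),
    rng.normal(loc=4.0, scale=0.5, size=(100, 2)),
])

# Fit a two-component mixture with EM
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

hard_labels = gmm.predict(X)        # most likely component per point
soft_labels = gmm.predict_proba(X)  # probability of each component per point

print(gmm.converged_, gmm.n_iter_)
print(soft_labels[:3].round(3))
```

Each row of `soft_labels` sums to 1, and the hard label is simply the component with the largest probability.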

In [57]:
# Initialize a list to store the silhouette scores
silhouette_scores = []

for p in range(2, 8):
    # Perform GMM clustering on the two PCA columns only, so that label
    # columns added earlier are not treated as features
    gmm = GMM(n_components=p)
    seeds_pca_df[f'{p}_clusters'] = gmm.fit_predict(seeds_pca_df[['col1', 'col2']])
    
    # Compute the silhouette score for the current clustering model
    silhouette_scores.append(metrics.silhouette_score(seeds_pca_df[['col1', 'col2']], seeds_pca_df[f'{p}_clusters']))

# Create a figure with 3 rows and 2 columns of subplots
fig, ax = plt.subplots(3, 2, sharex=True, sharey=True, figsize=(10, 8))

# Loop through each row and column in the figure
for row in range(3):
    for col in range(2):
        # Set the color palette for the current subplot
        n_clusters = 3*col + row + 2
        color_map = plt.get_cmap('magma', n_clusters)
        
        # Create a scatterplot on the current subplot using the 'col1' and 'col2' columns as the x and y values, respectively
        ax[row, col].scatter(seeds_pca_df['col1'], seeds_pca_df['col2'], 
                             c=seeds_pca_df[f'{n_clusters}_clusters'], cmap=color_map)
        
        # Set the title of the current subplot
        ax[row, col].set_title(f"{n_clusters} Clusters\nSilhouette Score = {silhouette_scores[n_clusters-2]:.3f}")
        
        # Add a grid to the current subplot
        ax[row, col].grid(True)

# Add a title to the figure
plt.suptitle("GMM Clustering")

# Tighten the layout of the figure
fig.tight_layout()

DBSCAN¶

DBSCAN is an algorithm for clustering data points into clusters based on their density.

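DBSCAN takes two main parameters, eps and min_samples. eps is the maximum distance between two points for them to be considered neighbours. min_samples is the minimum number of points that must lie within eps of a point (including the point itself) for it to count as a core point of a cluster. Points that belong to no cluster are labelled as noise (-1).

With the parameters tried below, DBSCAN does not separate the data into meaningful clusters, so we do not pursue it further.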
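
To make the role of the two parameters and the noise label concrete, the sketch below runs DBSCAN on synthetic data with two dense blobs and two isolated points. The data and parameter values are illustrative assumptions, unrelated to the seeds dataset.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
dense_a = rng.normal(loc=0.0, scale=0.2, size=(80, 2))
dense_b = rng.normal(loc=5.0, scale=0.2, size=(80, 2))
outliers = np.array([[2.5, 2.5], [-3.0, 6.0]])  # isolated points
X = np.vstack([dense_a, dense_b, outliers])

labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)

# Cluster labels start at 0; -1 marks noise and is not a cluster
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int(np.sum(labels == -1))
print(n_clusters, n_noise)
```

Counting clusters as `max(labels) + 1`, or with the set-based expression above, avoids treating the noise label as an extra cluster.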

In [58]:
eps = 1.0
min_samples = 10

# Initialise and fit DBSCAN
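db = DBSCAN(eps=eps, min_samples=min_samples).fit(seeds_pca_df[['col1', 'col2']])
labels = db.labels_

# Number of clusters found, excluding the noise label (-1)
n_found = len(set(labels)) - (1 if -1 in labels else 0)

plt.scatter(seeds_pca_df['col1'], seeds_pca_df['col2'], s=15, c=labels, cmap='jet')
plt.title(f'DBSCAN, eps={eps}, min_samples={min_samples}, n_clusters={n_found}')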
plt.gca().set_aspect('equal') 

Compare Methods¶

Looking at the PCA projection of the seeds dataset, the points are spread fairly evenly over the two components rather than forming clearly separated groups.

As this is a small dataset, k-means is a fast and efficient clustering method. It iteratively refines the centroids until convergence, and here it produces compact, well-separated clusters.

GMM is useful when a probabilistic model of the dataset is wanted, since it gives the likelihood of each point belonging to each cluster. For this dataset, however, some of the GMM clusters are not well formed, so we prefer k-means over the other two methods.
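
A numerical way to back up a comparison like this is to score each method's labels on the same features. The sketch below does so with the silhouette score on synthetic three-blob data; the data, the fixed choice of three clusters and the `random_state` values are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Three synthetic blobs of 70 points each
X = np.vstack([
    rng.normal(loc=c, scale=0.5, size=(70, 2)) for c in (0.0, 3.0, 6.0)
])

# Labels from each method, fitted on exactly the same feature matrix
candidates = {
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X),
    "gmm": GaussianMixture(n_components=3, random_state=0).fit_predict(X),
}

# Silhouette score lies in [-1, 1]; higher means tighter, better separated clusters
scores = {name: silhouette_score(X, labels) for name, labels in candidates.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

The score must always be computed on the feature columns alone, never on a table that already contains cluster labels.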

Task3¶

Data Understanding¶


The social network dataset has been taken from the Koblenz Network Collection (Source: http://konect.cc/)

It contains social network data that has been anonymised for analysis with users as nodes (numbered 1 to 2888) and edges being undirected.
Each row is an edge between two nodes of the network.

Data Overview and Description¶


In [59]:
# No column names in the dataset, so we add them
colnames=['nodes', 'edges'] 
In [60]:
# Load the dataset `social-network.csv` into `soc_net`
soc_net = pd.read_csv('social-network.csv', names=colnames, header=None)
soc_net.head()
Out[60]:
nodes edges
0 1 2
1 1 3
2 1 4
3 1 5
4 1 6

We describe the numerical statistics such as max nodes and edges and also check if there are any missing/null values to handle.

In [61]:
soc_net.describe()
Out[61]:
nodes edges
count 2981.000000 2981.000000
mean 970.580342 1458.960751
std 776.901860 839.483888
min 1.000000 2.000000
25% 288.000000 720.000000
50% 603.000000 1460.000000
75% 1525.000000 2202.000000
max 2699.000000 2888.000000
In [62]:
soc_net.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2981 entries, 0 to 2980
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   nodes   2981 non-null   int64
 1   edges   2981 non-null   int64
dtypes: int64(2)
memory usage: 46.7 KB

NetworkX lets us create a graph directly from a pandas dataframe.

In [63]:
# `nx.from_pandas_edgelist()` takes the dataframe and the names of the source and target columns.
net_graph = nx.from_pandas_edgelist(soc_net, source='nodes', target='edges')
net_graph
Out[63]:
<networkx.classes.graph.Graph at 0x7f9862080430>
In [64]:
net_graph.number_of_nodes()
Out[64]:
2888
In [65]:
net_graph.number_of_edges() 
Out[65]:
2981

With 2981 edges for 2888 nodes, the average degree is about 2.06 (2 × 2981 / 2888), so some nodes have multiple edges, and it will be interesting to explore the visualisation.
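
The relationship between edge count and degree can be checked with a few lines of plain Python. The sketch below uses the node and edge counts reported above, plus a small made-up edge list to show how per-node degrees are tallied. On the real graph, `net_graph.degree()` provides the same per-node information.

```python
from collections import Counter

# Counts reported above for the social network graph
n_nodes, n_edges = 2888, 2981

# Every undirected edge adds one to the degree of both of its endpoints
mean_degree = 2 * n_edges / n_nodes
print(round(mean_degree, 3))  # 2.064

# Toy edge list (hypothetical) in which node 1 acts as a hub
toy_edges = [(1, 2), (1, 3), (1, 4), (1, 5), (4, 5)]
degree = Counter()
for u, v in toy_edges:
    degree[u] += 1
    degree[v] += 1

# Nodes sorted by degree, highest first
print(degree.most_common(2))
```

A low mean degree combined with a few very high-degree nodes is typical of hub-dominated networks.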

In [66]:
net_graph.nodes()
Out[66]:
NodeView((1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 1525, 603, 710, 714, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 
416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 711, 712, 713, 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767, 2232, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792, 793, 794, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805, 806, 807, 808, 809, 810, 811, 812, 813, 814, 815, 816, 
817, 818, 819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831, 832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844, 845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857, 858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, 897, 898, 899, 900, 901, 902, 903, 904, 905, 906, 907, 908, 909, 910, 911, 912, 913, 914, 915, 916, 917, 918, 919, 920, 921, 922, 923, 924, 925, 926, 927, 928, 929, 930, 931, 932, 933, 934, 935, 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956, 957, 958, 959, 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 1020, 1021, 1022, 1023, 1024, 1025, 1026, 1027, 1028, 1029, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039, 1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148, 1149, 1150, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160, 1161, 1162, 1163, 1164, 1165, 1166, 1167, 1168, 1169, 1170, 1171, 1172, 1173, 1174, 1175, 1176, 1177, 1178, 1179, 
1180, 1181, 1182, 1183, 1184, 1185, 1186, 1187, 1188, 1189, 1190, 1191, 1192, 1193, 1194, 1195, 1196, 1197, 1198, 1199, 1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211, 1212, 1213, 1214, 1215, 1216, 1217, 1218, 1219, 1220, 1221, 1222, 1223, 1224, 1225, 1226, 1227, 1228, 1229, 1230, 1231, 1232, 1233, 1234, 1235, 1236, 1237, 1238, 1239, 1240, 1241, 1242, 1243, 1244, 1245, 1246, 1247, 1248, 1249, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258, 1259, 1260, 1261, 1262, 1263, 1264, 1265, 1266, 1267, 1268, 1269, 1270, 1271, 1272, 1273, 1274, 1275, 1276, 1277, 1278, 1279, 1280, 1281, 1282, 1283, 1284, 1285, 1286, 1287, 1288, 1289, 1290, 1291, 1292, 1293, 1294, 1295, 1296, 1297, 1298, 1299, 1300, 1301, 1302, 1303, 1304, 1305, 1306, 1307, 1308, 1309, 1310, 1311, 1312, 1313, 1314, 1315, 1316, 1317, 1318, 1319, 1320, 1321, 1322, 1323, 1324, 1325, 1326, 1327, 1328, 1329, 1330, 1331, 1332, 1333, 1334, 1335, 1336, 1337, 1338, 1339, 1340, 1341, 1342, 1343, 1344, 1345, 1346, 1347, 1348, 1349, 1350, 1351, 1352, 1353, 1354, 1355, 1356, 1357, 1358, 1359, 1360, 1361, 1362, 1363, 1364, 1365, 1366, 1367, 1368, 1369, 1370, 1371, 1372, 1373, 1374, 1375, 1376, 1377, 1378, 1379, 1380, 1381, 1382, 1383, 1384, 1385, 1386, 1387, 1388, 1389, 1390, 1391, 1392, 1393, 1394, 1395, 1396, 1397, 1398, 1399, 1400, 1401, 1402, 1403, 1404, 1405, 1406, 1407, 1408, 1409, 1410, 1411, 1412, 1413, 1414, 1415, 1416, 1417, 1418, 1419, 1420, 1421, 1422, 1423, 1424, 1425, 1426, 1427, 1428, 1429, 1430, 1431, 1432, 1433, 1434, 1435, 1436, 1437, 1438, 1439, 1440, 1441, 1442, 1443, 1444, 1445, 1446, 1447, 1448, 1449, 1450, 1451, 1452, 1453, 1454, 1455, 1456, 1457, 1458, 1459, 1460, 1461, 1462, 1463, 1464, 1465, 1466, 1467, 1468, 1469, 1470, 1471, 1472, 1473, 1474, 1475, 1476, 1477, 1478, 1479, 1480, 1481, 1482, 1483, 1484, 1485, 1486, 1487, 1488, 1489, 1490, 1491, 1492, 1493, 1494, 1495, 1496, 1497, 1498, 1499, 1500, 1501, 1502, 1503, 1504, 1505, 1506, 1507, 1508, 1509, 1510, 1511, 1512, 
1513, 1514, 1515, 1516, 1517, 1518, 1519, 1520, 1521, 1522, 1523, 1524, 2329, 2330, 2331, 2332, 2333, 2334, 2335, 2336, 2337, 2338, 2339, 2340, 2341, 2342, 2343, 2344, 2345, 2346, 2347, 2348, 2349, 2350, 2351, 2352, 2353, 2354, 2355, 2356, 2357, 2358, 2359, 2360, 2361, 2362, 2363, 2364, 2365, 2366, 2367, 2368, 2369, 2370, 2371, 2372, 2373, 2374, 2375, 2376, 2377, 2378, 2379, 2380, 2381, 2382, 2383, 2384, 2385, 2386, 2387, 2388, 2389, 2390, 2391, 2392, 2393, 2394, 2395, 2396, 2397, 2398, 2399, 2400, 2401, 2402, 2403, 2404, 2405, 2406, 2407, 2408, 2409, 2410, 2411, 2412, 2413, 2414, 2415, 2416, 2417, 2418, 2419, 2420, 2421, 2422, 2423, 2424, 2425, 2426, 2427, 2428, 2429, 2430, 2431, 2432, 2433, 2434, 2435, 2436, 2437, 2438, 2439, 2440, 2441, 2442, 2443, 2444, 2445, 2446, 2447, 2448, 2449, 2450, 2451, 2452, 2453, 2454, 2455, 2456, 2457, 2458, 2459, 2460, 2461, 2462, 2463, 2464, 2465, 2466, 2467, 2468, 2469, 2470, 2471, 2472, 2473, 2474, 2475, 2476, 2477, 2478, 2479, 2480, 2481, 2482, 2483, 2484, 2485, 2486, 2487, 2488, 2489, 2490, 2491, 2492, 2493, 2494, 2495, 2496, 2497, 2498, 2499, 2500, 2501, 2502, 2503, 2504, 2505, 2506, 2507, 2508, 2509, 2510, 2511, 2512, 2513, 2514, 2515, 2516, 2517, 2518, 2519, 2520, 2521, 2522, 2523, 2524, 2525, 2526, 2527, 2528, 2529, 2530, 2531, 2532, 2533, 2534, 2535, 2594, 2595, 2596, 2597, 2598, 2599, 2600, 2601, 2602, 2603, 2604, 2605, 2606, 2607, 2608, 2609, 2610, 2611, 2612, 2613, 2614, 2615, 2616, 2617, 2618, 2619, 2620, 2621, 2622, 2623, 2624, 2625, 2626, 2627, 2628, 2629, 2630, 2631, 2632, 2633, 2634, 2635, 2636, 2637, 2638, 2639, 2640, 2641, 2642, 2643, 2644, 2645, 2646, 2647, 2648, 2649, 2650, 2651, 2652, 2653, 2654, 2655, 2656, 2657, 2658, 2659, 2660, 2661, 2662, 2663, 2664, 2665, 2666, 2667, 2668, 2669, 2670, 2671, 2672, 2673, 2674, 2675, 2676, 2677, 2678, 2679, 2680, 2681, 2682, 2683, 2684, 2685, 2686, 2699, 1526, 1527, 1528, 1529, 1530, 1531, 1532, 1533, 1534, 1535, 1536, 1537, 1538, 1539, 1540, 1541, 1542, 1543, 1544, 1545, 
1546, 1547, 1548, 1549, 1550, 1551, 1552, 1553, 1554, 1555, 1556, 1557, 1558, 1559, 1560, 1561, 1562, 1563, 1564, 1565, 1566, 1567, 1568, 1569, 1570, 1571, 1572, 1573, 1574, 1575, 1576, 1577, 1578, 1579, 1580, 1581, 1582, 1583, 1584, 1585, 1586, 1587, 1588, 1589, 1590, 1591, 1592, 1593, 1594, 1595, 1596, 1597, 1598, 1599, 1600, 1601, 1602, 1603, 1604, 1605, 1606, 1607, 1608, 1609, 1610, 1611, 1612, 1613, 1614, 1615, 1616, 1617, 1618, 1619, 1620, 1621, 1622, 1623, 1624, 1625, 1626, 1627, 1628, 1629, 1630, 1631, 1632, 1633, 1634, 1635, 1636, 1637, 1638, 1639, 1640, 1641, 1642, 1643, 1644, 1645, 1646, 1647, 1648, 1649, 1650, 1651, 1652, 1653, 1654, 1655, 1656, 1657, 1658, 1659, 1660, 1661, 1662, 1663, 1664, 1665, 1666, 1667, 1668, 1669, 1670, 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1678, 1679, 1680, 1681, 1682, 1683, 1684, 1685, 1686, 1687, 1688, 1689, 1690, 1691, 1692, 1693, 1694, 1695, 1696, 1697, 1698, 1699, 1700, 1701, 1702, 1703, 1704, 1705, 1706, 1707, 1708, 1709, 1710, 1711, 1712, 1713, 1714, 1715, 1716, 1717, 1718, 1719, 1720, 1721, 1722, 1723, 1724, 1725, 1726, 1727, 1728, 1729, 1730, 1731, 1732, 1733, 1734, 1735, 1736, 1737, 1738, 1739, 1740, 1741, 1742, 1743, 1744, 1745, 1746, 1747, 1748, 1749, 1750, 1751, 1752, 1753, 1754, 1755, 1756, 1757, 1758, 1759, 1760, 1761, 1762, 1763, 1764, 1765, 1766, 1767, 1768, 1769, 1770, 1771, 1772, 1773, 1774, 1775, 1776, 1777, 1778, 1779, 1780, 1781, 1782, 1783, 1784, 1785, 1786, 1787, 1788, 1789, 1790, 1791, 1792, 1793, 1794, 1795, 1796, 1797, 1798, 1799, 1800, 1801, 1802, 1803, 1804, 1805, 1806, 1807, 1808, 1809, 1810, 1811, 1812, 1813, 1814, 1815, 1816, 1817, 1818, 1819, 1820, 1821, 1822, 1823, 1824, 1825, 1826, 1827, 1828, 1829, 1830, 1831, 1832, 1833, 1834, 1835, 1836, 1837, 1838, 1839, 1840, 1841, 1842, 1843, 1844, 1845, 1846, 1847, 1848, 1849, 1850, 1851, 1852, 1853, 1854, 1855, 1856, 1857, 1858, 1859, 1860, 1861, 1862, 1863, 1864, 1865, 1866, 1867, 1868, 1869, 1870, 1871, 1872, 1873, 1874, 1875, 1876, 1877, 1878, 
1879, 1880, 1881, 1882, 1883, 1884, 1885, 1886, 1887, 1888, 1889, 1890, 1891, 1892, 1893, 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901, 1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909, 1910, 1911, 1912, 1913, 1914, 1915, 1916, 1917, 1918, 1919, 1920, 1921, 1922, 1923, 1924, 1925, 1926, 1927, 1928, 1929, 1930, 1931, 1932, 1933, 1934, 1935, 1936, 1937, 1938, 1939, 1940, 1941, 1942, 1943, 1944, 1945, 1946, 1947, 1948, 1949, 1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957, 1958, 1959, 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024, 2025, 2026, 2027, 2028, 2029, 2030, 2031, 2032, 2033, 2034, 2035, 2036, 2037, 2038, 2039, 2040, 2041, 2042, 2043, 2044, 2045, 2046, 2047, 2048, 2049, 2050, 2051, 2052, 2053, 2054, 2055, 2056, 2057, 2058, 2059, 2060, 2061, 2062, 2063, 2064, 2065, 2066, 2067, 2068, 2069, 2070, 2071, 2072, 2073, 2074, 2075, 2076, 2077, 2078, 2079, 2080, 2081, 2082, 2083, 2084, 2085, 2086, 2087, 2088, 2089, 2090, 2091, 2092, 2093, 2094, 2095, 2096, 2097, 2098, 2099, 2100, 2101, 2102, 2103, 2104, 2105, 2106, 2107, 2108, 2109, 2110, 2111, 2112, 2113, 2114, 2115, 2116, 2117, 2118, 2119, 2120, 2121, 2122, 2123, 2124, 2125, 2126, 2127, 2128, 2129, 2130, 2131, 2132, 2133, 2134, 2135, 2136, 2137, 2138, 2139, 2140, 2141, 2142, 2143, 2144, 2145, 2146, 2147, 2148, 2149, 2150, 2151, 2152, 2153, 2154, 2155, 2156, 2157, 2158, 2159, 2160, 2161, 2162, 2163, 2164, 2165, 2166, 2167, 2168, 2169, 2170, 2171, 2172, 2173, 2174, 2175, 2176, 2177, 2178, 2179, 2180, 2181, 2182, 2183, 2184, 2185, 2186, 2187, 2188, 2189, 2190, 2191, 2192, 2193, 2194, 2195, 2196, 2197, 2198, 2199, 2200, 2201, 2202, 2203, 2204, 2205, 2206, 2207, 2208, 2209, 2210, 2211, 
2212, 2213, 2214, 2215, 2216, 2217, 2218, 2219, 2220, 2221, 2222, 2223, 2224, 2225, 2226, 2227, 2228, 2229, 2230, 2231, 2233, 2234, 2235, 2236, 2237, 2238, 2239, 2240, 2241, 2242, 2243, 2244, 2245, 2246, 2247, 2248, 2249, 2250, 2251, 2252, 2253, 2254, 2255, 2256, 2257, 2258, 2259, 2260, 2261, 2262, 2263, 2264, 2265, 2266, 2267, 2268, 2269, 2270, 2271, 2272, 2273, 2274, 2275, 2276, 2277, 2278, 2279, 2280, 2281, 2282, 2283, 2284, 2285, 2286, 2287, 2288, 2289, 2290, 2291, 2292, 2293, 2294, 2295, 2296, 2297, 2298, 2299, 2300, 2301, 2302, 2303, 2304, 2305, 2306, 2307, 2308, 2309, 2310, 2311, 2312, 2313, 2314, 2315, 2316, 2317, 2318, 2319, 2320, 2321, 2322, 2323, 2324, 2325, 2326, 2327, 2328, 2536, 2537, 2538, 2539, 2540, 2541, 2542, 2543, 2544, 2545, 2546, 2547, 2548, 2549, 2550, 2551, 2552, 2553, 2554, 2555, 2556, 2557, 2558, 2559, 2560, 2561, 2562, 2563, 2564, 2565, 2566, 2567, 2568, 2569, 2570, 2571, 2572, 2573, 2574, 2575, 2576, 2577, 2578, 2579, 2580, 2581, 2582, 2583, 2584, 2585, 2586, 2587, 2588, 2589, 2590, 2591, 2592, 2593, 2687, 2688, 2689, 2690, 2691, 2692, 2693, 2694, 2695, 2696, 2697, 2698, 2700, 2701, 2702, 2703, 2704, 2705, 2706, 2707, 2708, 2709, 2710, 2711, 2712, 2713, 2714, 2715, 2716, 2717, 2718, 2719, 2720, 2721, 2722, 2723, 2724, 2725, 2726, 2727, 2728, 2729, 2730, 2731, 2732, 2733, 2734, 2735, 2736, 2737, 2738, 2739, 2740, 2741, 2742, 2743, 2744, 2745, 2746, 2747, 2748, 2749, 2750, 2751, 2752, 2753, 2754, 2755, 2756, 2757, 2758, 2759, 2760, 2761, 2762, 2763, 2764, 2765, 2766, 2767, 2768, 2769, 2770, 2771, 2772, 2773, 2774, 2775, 2776, 2777, 2778, 2779, 2780, 2781, 2782, 2783, 2784, 2785, 2786, 2787, 2788, 2789, 2790, 2791, 2792, 2793, 2794, 2795, 2796, 2797, 2798, 2799, 2800, 2801, 2802, 2803, 2804, 2805, 2806, 2807, 2808, 2809, 2810, 2811, 2812, 2813, 2814, 2815, 2816, 2817, 2818, 2819, 2820, 2821, 2822, 2823, 2824, 2825, 2826, 2827, 2828, 2829, 2830, 2831, 2832, 2833, 2834, 2835, 2836, 2837, 2838, 2839, 2840, 2841, 2842, 2843, 2844, 2845, 2846, 
2847, 2848, 2849, 2850, 2851, 2852, 2853, 2854, 2855, 2856, 2857, 2858, 2859, 2860, 2861, 2862, 2863, 2864, 2865, 2866, 2867, 2868, 2869, 2870, 2871, 2872, 2873, 2874, 2875, 2876, 2877, 2878, 2879, 2880, 2881, 2882, 2883, 2884, 2885, 2886, 2887, 2888))
In [67]:
net_graph.edges()
Out[67]:
EdgeView([(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (1, 10), (1, 11), (1, 12), (1, 13), (1, 14), (1, 15), (1, 16), (1, 17), (1, 18), (1, 19), (1, 20), (1, 21), (1, 22), (1, 23), (1, 24), (1, 25), (1, 26), (1, 27), (1, 28), (1, 29), (1, 30), (1, 31), (1, 32), (1, 33), (1, 34), (1, 35), (1, 36), (1, 37), (1, 38), (1, 39), (1, 40), (1, 41), (1, 42), (1, 43), (1, 44), (1, 45), (1, 46), (1, 47), (1, 48), (1, 49), (1, 50), (1, 51), (1, 52), (1, 53), (1, 54), (1, 55), (1, 56), (1, 57), (1, 58), (1, 59), (1, 60), (1, 61), (1, 62), (1, 63), (1, 64), (1, 65), (1, 66), (1, 67), (1, 68), (1, 69), (1, 70), (1, 71), (1, 72), (1, 73), (1, 74), (1, 75), (1, 76), (1, 77), (1, 78), (1, 79), (1, 80), (1, 81), (1, 82), (1, 83), (1, 84), (1, 85), (1, 86), (1, 87), (1, 88), (1, 89), (1, 90), (1, 91), (1, 92), (1, 93), (1, 94), (1, 95), (1, 96), (1, 97), (1, 98), (1, 99), (1, 100), (1, 101), (1, 102), (1, 103), (1, 104), (1, 105), (1, 106), (1, 107), (1, 108), (1, 109), (1, 110), (1, 111), (1, 112), (1, 113), (1, 114), (1, 115), (1, 116), (1, 117), (1, 118), (1, 119), (1, 120), (1, 121), (1, 122), (1, 123), (1, 124), (1, 125), (1, 126), (1, 127), (1, 128), (1, 129), (1, 130), (1, 131), (1, 132), (1, 133), (1, 134), (1, 135), (1, 136), (1, 137), (1, 138), (1, 139), (1, 140), (1, 141), (1, 142), (1, 143), (1, 144), (1, 145), (1, 146), (1, 147), (1, 148), (1, 149), (1, 150), (1, 151), (1, 152), (1, 153), (1, 154), (1, 155), (1, 156), (1, 157), (1, 158), (1, 159), (1, 160), (1, 161), (1, 162), (1, 163), (1, 164), (1, 165), (1, 166), (1, 167), (1, 168), (1, 169), (1, 170), (1, 171), (1, 172), (1, 173), (1, 174), (1, 175), (1, 176), (1, 177), (1, 178), (1, 179), (1, 180), (1, 181), (1, 182), (1, 183), (1, 184), (1, 185), (1, 186), (1, 187), (1, 188), (1, 189), (1, 190), (1, 191), (1, 192), (1, 193), (1, 194), (1, 195), (1, 196), (1, 197), (1, 198), (1, 199), (1, 200), (1, 201), (1, 202), (1, 203), (1, 204), (1, 205), (1, 206), (1, 207), (1, 208), (1, 209), (1, 210), (1, 
211), (1, 212), (1, 213), (1, 214), (1, 215), (1, 216), (1, 217), (1, 218), (1, 219), (1, 220), (1, 221), (1, 222), (1, 223), (1, 224), (1, 225), (1, 226), (1, 227), (1, 228), (1, 229), (1, 230), (1, 231), (1, 232), (1, 233), (1, 234), (1, 235), (1, 236), (1, 237), (1, 238), (1, 239), (1, 240), (1, 241), (1, 242), (1, 243), (1, 244), (1, 245), (1, 246), (1, 247), (1, 248), (1, 249), (1, 250), (1, 251), (1, 252), (1, 253), (1, 254), (1, 255), (1, 256), (1, 257), (1, 258), (1, 259), (1, 260), (1, 261), (1, 262), (1, 263), (1, 264), (1, 265), (1, 266), (1, 267), (1, 268), (1, 269), (1, 270), (1, 271), (1, 272), (1, 273), (1, 274), (1, 275), (1, 276), (1, 277), (1, 278), (1, 279), (1, 280), (1, 281), (1, 282), (1, 283), (1, 284), (1, 285), (1, 286), (1, 287), (1, 288), (35, 1525), (69, 603), (71, 710), (71, 714), (90, 710), (217, 710), (247, 288), (247, 603), (247, 1525), (288, 289), (288, 290), (288, 291), (288, 292), (288, 293), (288, 294), (288, 295), (288, 296), (288, 297), (288, 298), (288, 299), (288, 300), (288, 301), (288, 302), (288, 303), (288, 304), (288, 305), (288, 306), (288, 307), (288, 308), (288, 309), (288, 310), (288, 311), (288, 312), (288, 313), (288, 314), (288, 315), (288, 316), (288, 317), (288, 318), (288, 319), (288, 320), (288, 321), (288, 322), (288, 323), (288, 324), (288, 325), (288, 326), (288, 327), (288, 328), (288, 329), (288, 330), (288, 331), (288, 332), (288, 333), (288, 334), (288, 335), (288, 336), (288, 337), (288, 338), (288, 339), (288, 340), (288, 341), (288, 342), (288, 343), (288, 344), (288, 345), (288, 346), (288, 347), (288, 348), (288, 349), (288, 350), (288, 351), (288, 352), (288, 353), (288, 354), (288, 355), (288, 356), (288, 357), (288, 358), (288, 359), (288, 360), (288, 361), (288, 362), (288, 363), (288, 364), (288, 365), (288, 366), (288, 367), (288, 368), (288, 369), (288, 370), (288, 371), (288, 372), (288, 373), (288, 374), (288, 375), (288, 376), (288, 377), (288, 378), (288, 379), (288, 380), (288, 381), 
(288, 382), (288, 383), (288, 384), (288, 385), (288, 386), (288, 387), (288, 388), (288, 389), (288, 390), (288, 391), (288, 392), (288, 393), (288, 394), (288, 395), (288, 396), (288, 397), (288, 398), (288, 399), (288, 400), (288, 401), (288, 402), (288, 403), (288, 404), (288, 405), (288, 406), (288, 407), (288, 408), (288, 409), (288, 410), (288, 411), (288, 412), (288, 413), (288, 414), (288, 415), (288, 416), (288, 417), (288, 418), (288, 419), (288, 420), (288, 421), (288, 422), (288, 423), (288, 424), (288, 425), (288, 426), (288, 427), (288, 428), (288, 429), (288, 430), (288, 431), (288, 432), (288, 433), (288, 434), (288, 435), (288, 436), (288, 437), (288, 438), (288, 439), (288, 440), (288, 441), (288, 442), (288, 443), (288, 444), (288, 445), (288, 446), (288, 447), (288, 448), (288, 449), (288, 450), (288, 451), (288, 452), (288, 453), (288, 454), (288, 455), (288, 456), (288, 457), (288, 458), (288, 459), (288, 460), (288, 461), (288, 462), (288, 463), (288, 464), (288, 465), (288, 466), (288, 467), (288, 468), (288, 469), (288, 470), (288, 471), (288, 472), (288, 473), (288, 474), (288, 475), (288, 476), (288, 477), (288, 478), (288, 479), (288, 480), (288, 481), (288, 482), (288, 483), (288, 484), (288, 485), (288, 486), (288, 487), (288, 488), (288, 489), (288, 490), (288, 491), (288, 492), (288, 493), (288, 494), (288, 495), (288, 496), (288, 497), (288, 498), (288, 499), (288, 500), (288, 501), (288, 502), (288, 503), (288, 504), (288, 505), (288, 506), (288, 507), (288, 508), (288, 509), (288, 510), (288, 511), (288, 512), (288, 513), (288, 514), (288, 515), (288, 516), (288, 517), (288, 518), (288, 519), (288, 520), (288, 521), (288, 522), (288, 523), (288, 524), (288, 525), (288, 526), (288, 527), (288, 528), (288, 529), (288, 530), (288, 531), (288, 532), (288, 533), (288, 534), (288, 535), (288, 536), (288, 537), (288, 538), (288, 539), (288, 540), (288, 541), (288, 542), (288, 543), (288, 544), (288, 545), (288, 546), (288, 547), (288, 
548), (288, 549), (288, 550), (288, 551), (288, 552), (288, 553), (288, 554), (288, 555), (288, 556), (288, 557), (288, 558), (288, 559), (288, 560), (288, 561), (288, 562), (288, 563), (288, 564), (288, 565), (288, 566), (288, 567), (288, 568), (288, 569), (288, 570), (288, 571), (288, 572), (288, 573), (288, 574), (288, 575), (288, 576), (288, 577), (288, 578), (288, 579), (288, 580), (288, 581), (288, 582), (288, 583), (288, 584), (288, 585), (288, 586), (288, 587), (288, 588), (288, 589), (288, 590), (288, 591), (288, 592), (288, 593), (288, 594), (288, 595), (288, 596), (288, 597), (288, 598), (288, 599), (288, 600), (288, 601), (288, 602), (288, 603), (288, 604), (288, 605), (288, 606), (288, 607), (288, 608), (288, 609), (288, 610), (288, 611), (288, 612), (288, 613), (288, 614), (288, 615), (288, 616), (288, 617), (288, 618), (288, 619), (288, 620), (288, 621), (288, 622), (288, 623), (288, 624), (288, 625), (288, 626), (288, 627), (288, 628), (288, 629), (288, 630), (288, 631), (288, 632), (288, 633), (288, 634), (288, 635), (288, 636), (288, 637), (288, 638), (288, 639), (288, 640), (288, 641), (288, 642), (288, 643), (288, 644), (288, 645), (288, 646), (288, 647), (288, 648), (288, 649), (288, 650), (288, 651), (288, 652), (288, 653), (288, 654), (288, 655), (288, 656), (288, 657), (288, 658), (288, 659), (288, 660), (288, 661), (288, 662), (288, 663), (288, 664), (288, 665), (288, 666), (288, 667), (288, 668), (288, 669), (288, 670), (288, 671), (288, 672), (288, 673), (288, 674), (288, 675), (288, 676), (288, 677), (288, 678), (288, 679), (288, 680), (288, 681), (288, 682), (288, 683), (288, 684), (288, 685), (288, 686), (288, 687), (288, 688), (288, 689), (288, 690), (288, 691), (288, 692), (288, 693), (288, 694), (288, 695), (288, 696), (288, 697), (288, 698), (288, 699), (288, 700), (288, 701), (288, 702), (288, 703), (288, 704), (288, 705), (288, 706), (288, 707), (288, 708), (288, 709), (288, 710), (288, 711), (288, 712), (288, 713), (288, 714), 
(288, 715), (288, 716), (288, 717), (288, 718), (288, 719), (288, 720), (288, 721), (288, 722), (288, 723), (288, 724), (288, 725), (288, 726), (288, 727), (288, 728), (288, 729), (288, 730), (288, 731), (288, 732), (288, 733), (288, 734), (288, 735), (288, 736), (288, 737), (288, 738), (288, 739), (288, 740), (288, 741), (288, 742), (288, 743), (288, 744), (288, 745), (288, 746), (288, 747), (288, 748), (288, 749), (288, 750), (288, 751), (288, 752), (288, 753), (288, 754), (288, 755), (288, 756), (288, 757), (288, 758), (288, 759), (288, 760), (288, 761), (288, 762), (288, 763), (288, 764), (288, 765), (288, 766), (288, 767), (1525, 716), (1525, 719), (1525, 1526), (1525, 1527), (1525, 1528), (1525, 1529), (1525, 1530), (1525, 1531), (1525, 1532), (1525, 1533), (1525, 1534), (1525, 1535), (1525, 1536), (1525, 1537), (1525, 1538), (1525, 1539), (1525, 1540), (1525, 1541), (1525, 1542), (1525, 1543), (1525, 1544), (1525, 1545), (1525, 1546), (1525, 1547), (1525, 1548), (1525, 1549), (1525, 1550), (1525, 1551), (1525, 1552), (1525, 1553), (1525, 1554), (1525, 1555), (1525, 1556), (1525, 1557), (1525, 1558), (1525, 1559), (1525, 1560), (1525, 1561), (1525, 1562), (1525, 1563), (1525, 1564), (1525, 1565), (1525, 1566), (1525, 1567), (1525, 1568), (1525, 1569), (1525, 1570), (1525, 1571), (1525, 1572), (1525, 1573), (1525, 1574), (1525, 1575), (1525, 1576), (1525, 1577), (1525, 1578), (1525, 1579), (1525, 1580), (1525, 1581), (1525, 1582), (1525, 1583), (1525, 1584), (1525, 1585), (1525, 1586), (1525, 1587), (1525, 1588), (1525, 1589), (1525, 1590), (1525, 1591), (1525, 1592), (1525, 1593), (1525, 1594), (1525, 1595), (1525, 1596), (1525, 1597), (1525, 1598), (1525, 1599), (1525, 1600), (1525, 1601), (1525, 1602), (1525, 1603), (1525, 1604), (1525, 1605), (1525, 1606), (1525, 1607), (1525, 1608), (1525, 1609), (1525, 1610), (1525, 1611), (1525, 1612), (1525, 1613), (1525, 1614), (1525, 1615), (1525, 1616), (1525, 1617), (1525, 1618), (1525, 1619), (1525, 1620), (1525, 
1621), (1525, 1622), (1525, 1623), (1525, 1624), (1525, 1625), (1525, 1626), (1525, 1627), (1525, 1628), (1525, 1629), (1525, 1630), (1525, 1631), (1525, 1632), (1525, 1633), (1525, 1634), (1525, 1635), (1525, 1636), (1525, 1637), (1525, 1638), (1525, 1639), (1525, 1640), (1525, 1641), (1525, 1642), (1525, 1643), (1525, 1644), (1525, 1645), (1525, 1646), (1525, 1647), (1525, 1648), (1525, 1649), (1525, 1650), (1525, 1651), (1525, 1652), (1525, 1653), (1525, 1654), (1525, 1655), (1525, 1656), (1525, 1657), (1525, 1658), (1525, 1659), (1525, 1660), (1525, 1661), (1525, 1662), (1525, 1663), (1525, 1664), (1525, 1665), (1525, 1666), (1525, 1667), (1525, 1668), (1525, 1669), (1525, 1670), (1525, 1671), (1525, 1672), (1525, 1673), (1525, 1674), (1525, 1675), (1525, 1676), (1525, 1677), (1525, 1678), (1525, 1679), (1525, 1680), (1525, 1681), (1525, 1682), (1525, 1683), (1525, 1684), (1525, 1685), (1525, 1686), (1525, 1687), (1525, 1688), (1525, 1689), (1525, 1690), (1525, 1691), (1525, 1692), (1525, 1693), (1525, 1694), (1525, 1695), (1525, 1696), (1525, 1697), (1525, 1698), (1525, 1699), (1525, 1700), (1525, 1701), (1525, 1702), (1525, 1703), (1525, 1704), (1525, 1705), (1525, 1706), (1525, 1707), (1525, 1708), (1525, 1709), (1525, 1710), (1525, 1711), (1525, 1712), (1525, 1713), (1525, 1714), (1525, 1715), (1525, 1716), (1525, 1717), (1525, 1718), (1525, 1719), (1525, 1720), (1525, 1721), (1525, 1722), (1525, 1723), (1525, 1724), (1525, 1725), (1525, 1726), (1525, 1727), (1525, 1728), (1525, 1729), (1525, 1730), (1525, 1731), (1525, 1732), (1525, 1733), (1525, 1734), (1525, 1735), (1525, 1736), (1525, 1737), (1525, 1738), (1525, 1739), (1525, 1740), (1525, 1741), (1525, 1742), (1525, 1743), (1525, 1744), (1525, 1745), (1525, 1746), (1525, 1747), (1525, 1748), (1525, 1749), (1525, 1750), (1525, 1751), (1525, 1752), (1525, 1753), (1525, 1754), (1525, 1755), (1525, 1756), (1525, 1757), (1525, 1758), (1525, 1759), (1525, 1760), (1525, 1761), (1525, 1762), (1525, 1763), 
(1525, 1764), (1525, 1765), (1525, 1766), (1525, 1767), (1525, 1768), (1525, 1769), (1525, 1770), (1525, 1771), (1525, 1772), (1525, 1773), (1525, 1774), (1525, 1775), (1525, 1776), (1525, 1777), (1525, 1778), (1525, 1779), (1525, 1780), (1525, 1781), (1525, 1782), (1525, 1783), (1525, 1784), (1525, 1785), (1525, 1786), (1525, 1787), (1525, 1788), (1525, 1789), (1525, 1790), (1525, 1791), (1525, 1792), (1525, 1793), (1525, 1794), (1525, 1795), (1525, 1796), (1525, 1797), (1525, 1798), (1525, 1799), (1525, 1800), (1525, 1801), (1525, 1802), (1525, 1803), (1525, 1804), (1525, 1805), (1525, 1806), (1525, 1807), (1525, 1808), (1525, 1809), (1525, 1810), (1525, 1811), (1525, 1812), (1525, 1813), (1525, 1814), (1525, 1815), (1525, 1816), (1525, 1817), (1525, 1818), (1525, 1819), (1525, 1820), (1525, 1821), (1525, 1822), (1525, 1823), (1525, 1824), (1525, 1825), (1525, 1826), (1525, 1827), (1525, 1828), (1525, 1829), (1525, 1830), (1525, 1831), (1525, 1832), (1525, 1833), (1525, 1834), (1525, 1835), (1525, 1836), (1525, 1837), (1525, 1838), (1525, 1839), (1525, 1840), (1525, 1841), (1525, 1842), (1525, 1843), (1525, 1844), (1525, 1845), (1525, 1846), (1525, 1847), (1525, 1848), (1525, 1849), (1525, 1850), (1525, 1851), (1525, 1852), (1525, 1853), (1525, 1854), (1525, 1855), (1525, 1856), (1525, 1857), (1525, 1858), (1525, 1859), (1525, 1860), (1525, 1861), (1525, 1862), (1525, 1863), (1525, 1864), (1525, 1865), (1525, 1866), (1525, 1867), (1525, 1868), (1525, 1869), (1525, 1870), (1525, 1871), (1525, 1872), (1525, 1873), (1525, 1874), (1525, 1875), (1525, 1876), (1525, 1877), (1525, 1878), (1525, 1879), (1525, 1880), (1525, 1881), (1525, 1882), (1525, 1883), (1525, 1884), (1525, 1885), (1525, 1886), (1525, 1887), (1525, 1888), (1525, 1889), (1525, 1890), (1525, 1891), (1525, 1892), (1525, 1893), (1525, 1894), (1525, 1895), (1525, 1896), (1525, 1897), (1525, 1898), (1525, 1899), (1525, 1900), (1525, 1901), (1525, 1902), (1525, 1903), (1525, 1904), (1525, 1905), (1525, 
1906), (1525, 1907), (1525, 1908), (1525, 1909), (1525, 1910), (1525, 1911), (1525, 1912), (1525, 1913), (1525, 1914), (1525, 1915), (1525, 1916), (1525, 1917), (1525, 1918), (1525, 1919), (1525, 1920), (1525, 1921), (1525, 1922), (1525, 1923), (1525, 1924), (1525, 1925), (1525, 1926), (1525, 1927), (1525, 1928), (1525, 1929), (1525, 1930), (1525, 1931), (1525, 1932), (1525, 1933), (1525, 1934), (1525, 1935), (1525, 1936), (1525, 1937), (1525, 1938), (1525, 1939), (1525, 1940), (1525, 1941), (1525, 1942), (1525, 1943), (1525, 1944), (1525, 1945), (1525, 1946), (1525, 1947), (1525, 1948), (1525, 1949), (1525, 1950), (1525, 1951), (1525, 1952), (1525, 1953), (1525, 1954), (1525, 1955), (1525, 1956), (1525, 1957), (1525, 1958), (1525, 1959), (1525, 1960), (1525, 1961), (1525, 1962), (1525, 1963), (1525, 1964), (1525, 1965), (1525, 1966), (1525, 1967), (1525, 1968), (1525, 1969), (1525, 1970), (1525, 1971), (1525, 1972), (1525, 1973), (1525, 1974), (1525, 1975), (1525, 1976), (1525, 1977), (1525, 1978), (1525, 1979), (1525, 1980), (1525, 1981), (1525, 1982), (1525, 1983), (1525, 1984), (1525, 1985), (1525, 1986), (1525, 1987), (1525, 1988), (1525, 1989), (1525, 1990), (1525, 1991), (1525, 1992), (1525, 1993), (1525, 1994), (1525, 1995), (1525, 1996), (1525, 1997), (1525, 1998), (1525, 1999), (1525, 2000), (1525, 2001), (1525, 2002), (1525, 2003), (1525, 2004), (1525, 2005), (1525, 2006), (1525, 2007), (1525, 2008), (1525, 2009), (1525, 2010), (1525, 2011), (1525, 2012), (1525, 2013), (1525, 2014), (1525, 2015), (1525, 2016), (1525, 2017), (1525, 2018), (1525, 2019), (1525, 2020), (1525, 2021), (1525, 2022), (1525, 2023), (1525, 2024), (1525, 2025), (1525, 2026), (1525, 2027), (1525, 2028), (1525, 2029), (1525, 2030), (1525, 2031), (1525, 2032), (1525, 2033), (1525, 2034), (1525, 2035), (1525, 2036), (1525, 2037), (1525, 2038), (1525, 2039), (1525, 2040), (1525, 2041), (1525, 2042), (1525, 2043), (1525, 2044), (1525, 2045), (1525, 2046), (1525, 2047), (1525, 2048), 
(1525, 2049), (1525, 2050), (1525, 2051), (1525, 2052), (1525, 2053), (1525, 2054), (1525, 2055), (1525, 2056), (1525, 2057), (1525, 2058), (1525, 2059), (1525, 2060), (1525, 2061), (1525, 2062), (1525, 2063), (1525, 2064), (1525, 2065), (1525, 2066), (1525, 2067), (1525, 2068), (1525, 2069), (1525, 2070), (1525, 2071), (1525, 2072), (1525, 2073), (1525, 2074), (1525, 2075), (1525, 2076), (1525, 2077), (1525, 2078), (1525, 2079), (1525, 2080), (1525, 2081), (1525, 2082), (1525, 2083), (1525, 2084), (1525, 2085), (1525, 2086), (1525, 2087), (1525, 2088), (1525, 2089), (1525, 2090), (1525, 2091), (1525, 2092), (1525, 2093), (1525, 2094), (1525, 2095), (1525, 2096), (1525, 2097), (1525, 2098), (1525, 2099), (1525, 2100), (1525, 2101), (1525, 2102), (1525, 2103), (1525, 2104), (1525, 2105), (1525, 2106), (1525, 2107), (1525, 2108), (1525, 2109), (1525, 2110), (1525, 2111), (1525, 2112), (1525, 2113), (1525, 2114), (1525, 2115), (1525, 2116), (1525, 2117), (1525, 2118), (1525, 2119), (1525, 2120), (1525, 2121), (1525, 2122), (1525, 2123), (1525, 2124), (1525, 2125), (1525, 2126), (1525, 2127), (1525, 2128), (1525, 2129), (1525, 2130), (1525, 2131), (1525, 2132), (1525, 2133), (1525, 2134), (1525, 2135), (1525, 2136), (1525, 2137), (1525, 2138), (1525, 2139), (1525, 2140), (1525, 2141), (1525, 2142), (1525, 2143), (1525, 2144), (1525, 2145), (1525, 2146), (1525, 2147), (1525, 2148), (1525, 2149), (1525, 2150), (1525, 2151), (1525, 2152), (1525, 2153), (1525, 2154), (1525, 2155), (1525, 2156), (1525, 2157), (1525, 2158), (1525, 2159), (1525, 2160), (1525, 2161), (1525, 2162), (1525, 2163), (1525, 2164), (1525, 2165), (1525, 2166), (1525, 2167), (1525, 2168), (1525, 2169), (1525, 2170), (1525, 2171), (1525, 2172), (1525, 2173), (1525, 2174), (1525, 2175), (1525, 2176), (1525, 2177), (1525, 2178), (1525, 2179), (1525, 2180), (1525, 2181), (1525, 2182), (1525, 2183), (1525, 2184), (1525, 2185), (1525, 2186), (1525, 2187), (1525, 2188), (1525, 2189), (1525, 2190), (1525, 
2191), (1525, 2192), (1525, 2193), (1525, 2194), (1525, 2195), (1525, 2196), (1525, 2197), (1525, 2198), (1525, 2199), (1525, 2200), (1525, 2201), (1525, 2202), (1525, 2203), (1525, 2204), (1525, 2205), (1525, 2206), (1525, 2207), (1525, 2208), (1525, 2209), (1525, 2210), (1525, 2211), (1525, 2212), (1525, 2213), (1525, 2214), (1525, 2215), (1525, 2216), (1525, 2217), (1525, 2218), (1525, 2219), (1525, 2220), (1525, 2221), (1525, 2222), (1525, 2223), (1525, 2224), (1525, 2225), (1525, 2226), (1525, 2227), (1525, 2228), (1525, 2229), (1525, 2230), (1525, 2231), (603, 469), (603, 493), (603, 510), (603, 526), (603, 584), (603, 594), (603, 624), (603, 639), (603, 764), (603, 768), (603, 769), (603, 770), (603, 771), (603, 772), (603, 773), (603, 774), (603, 775), (603, 776), (603, 777), (603, 778), (603, 779), (603, 780), (603, 781), (603, 782), (603, 783), (603, 784), (603, 785), (603, 786), (603, 787), (603, 788), (603, 789), (603, 790), (603, 791), (603, 792), (603, 793), (603, 794), (603, 795), (603, 796), (603, 797), (603, 798), (603, 799), (603, 800), (603, 801), (603, 802), (603, 803), (603, 804), (603, 805), (603, 806), (603, 807), (603, 808), (603, 809), (603, 810), (603, 811), (603, 812), (603, 813), (603, 814), (603, 815), (603, 816), (603, 817), (603, 818), (603, 819), (603, 820), (603, 821), (603, 822), (603, 823), (603, 824), (603, 825), (603, 826), (603, 827), (603, 828), (603, 829), (603, 830), (603, 831), (603, 832), (603, 833), (603, 834), (603, 835), (603, 836), (603, 837), (603, 838), (603, 839), (603, 840), (603, 841), (603, 842), (603, 843), (603, 844), (603, 845), (603, 846), (603, 847), (603, 848), (603, 849), (603, 850), (603, 851), (603, 852), (603, 853), (603, 854), (603, 855), (603, 856), (603, 857), (603, 858), (603, 859), (603, 860), (603, 861), (603, 862), (603, 863), (603, 864), (603, 865), (603, 866), (603, 867), (603, 868), (603, 869), (603, 870), (603, 871), (603, 872), (603, 873), (603, 874), (603, 875), (603, 876), (603, 877), 
(603, 878), (603, 879), (603, 880), (603, 881), (603, 882), (603, 883), (603, 884), (603, 885), (603, 886), (603, 887), (603, 888), (603, 889), (603, 890), (603, 891), (603, 892), (603, 893), (603, 894), (603, 895), (603, 896), (603, 897), (603, 898), (603, 899), (603, 900), (603, 901), (603, 902), (603, 903), (603, 904), (603, 905), (603, 906), (603, 907), (603, 908), (603, 909), (603, 910), (603, 911), (603, 912), (603, 913), (603, 914), (603, 915), (603, 916), (603, 917), (603, 918), (603, 919), (603, 920), (603, 921), (603, 922), (603, 923), (603, 924), (603, 925), (603, 926), (603, 927), (603, 928), (603, 929), (603, 930), (603, 931), (603, 932), (603, 933), (603, 934), (603, 935), (603, 936), (603, 937), (603, 938), (603, 939), (603, 940), (603, 941), (603, 942), (603, 943), (603, 944), (603, 945), (603, 946), (603, 947), (603, 948), (603, 949), (603, 950), (603, 951), (603, 952), (603, 953), (603, 954), (603, 955), (603, 956), (603, 957), (603, 958), (603, 959), (603, 960), (603, 961), (603, 962), (603, 963), (603, 964), (603, 965), (603, 966), (603, 967), (603, 968), (603, 969), (603, 970), (603, 971), (603, 972), (603, 973), (603, 974), (603, 975), (603, 976), (603, 977), (603, 978), (603, 979), (603, 980), (603, 981), (603, 982), (603, 983), (603, 984), (603, 985), (603, 986), (603, 987), (603, 988), (603, 989), (603, 990), (603, 991), (603, 992), (603, 993), (603, 994), (603, 995), (603, 996), (603, 997), (603, 998), (603, 999), (603, 1000), (603, 1001), (603, 1002), (603, 1003), (603, 1004), (603, 1005), (603, 1006), (603, 1007), (603, 1008), (603, 1009), (603, 1010), (603, 1011), (603, 1012), (603, 1013), (603, 1014), (603, 1015), (603, 1016), (603, 1017), (603, 1018), (603, 1019), (603, 1020), (603, 1021), (603, 1022), (603, 1023), (603, 1024), (603, 1025), (603, 1026), (603, 1027), (603, 1028), (603, 1029), (603, 1030), (603, 1031), (603, 1032), (603, 1033), (603, 1034), (603, 1035), (603, 1036), (603, 1037), (603, 1038), (603, 1039), (603, 1040), 
(603, 1041), (603, 1042), (603, 1043), (603, 1044), (603, 1045), (603, 1046), (603, 1047), (603, 1048), (603, 1049), (603, 1050), (603, 1051), (603, 1052), (603, 1053), (603, 1054), (603, 1055), (603, 1056), (603, 1057), (603, 1058), (603, 1059), (603, 1060), (603, 1061), (603, 1062), (603, 1063), (603, 1064), (603, 1065), (603, 1066), (603, 1067), (603, 1068), (603, 1069), (603, 1070), (603, 1071), (603, 1072), (603, 1073), (603, 1074), (603, 1075), (603, 1076), (603, 1077), (603, 1078), (603, 1079), (603, 1080), (603, 1081), (603, 1082), (603, 1083), (603, 1084), (603, 1085), (603, 1086), (603, 1087), (603, 1088), (603, 1089), (603, 1090), (603, 1091), (603, 1092), (603, 1093), (603, 1094), (603, 1095), (603, 1096), (603, 1097), (603, 1098), (603, 1099), (603, 1100), (603, 1101), (603, 1102), (603, 1103), (603, 1104), (603, 1105), (603, 1106), (603, 1107), (603, 1108), (603, 1109), (603, 1110), (603, 1111), (603, 1112), (603, 1113), (603, 1114), (603, 1115), (603, 1116), (603, 1117), (603, 1118), (603, 1119), (603, 1120), (603, 1121), (603, 1122), (603, 1123), (603, 1124), (603, 1125), (603, 1126), (603, 1127), (603, 1128), (603, 1129), (603, 1130), (603, 1131), (603, 1132), (603, 1133), (603, 1134), (603, 1135), (603, 1136), (603, 1137), (603, 1138), (603, 1139), (603, 1140), (603, 1141), (603, 1142), (603, 1143), (603, 1144), (603, 1145), (603, 1146), (603, 1147), (603, 1148), (603, 1149), (603, 1150), (603, 1151), (603, 1152), (603, 1153), (603, 1154), (603, 1155), (603, 1156), (603, 1157), (603, 1158), (603, 1159), (603, 1160), (603, 1161), (603, 1162), (603, 1163), (603, 1164), (603, 1165), (603, 1166), (603, 1167), (603, 1168), (603, 1169), (603, 1170), (603, 1171), (603, 1172), (603, 1173), (603, 1174), (603, 1175), (603, 1176), (603, 1177), (603, 1178), (603, 1179), (603, 1180), (603, 1181), (603, 1182), (603, 1183), (603, 1184), (603, 1185), (603, 1186), (603, 1187), (603, 1188), (603, 1189), (603, 1190), (603, 1191), (603, 1192), (603, 1193), (603, 
1194), (603, 1195), (603, 1196), (603, 1197), (603, 1198), (603, 1199), (603, 1200), (603, 1201), (603, 1202), (603, 1203), (603, 1204), (603, 1205), (603, 1206), (603, 1207), (603, 1208), (603, 1209), (603, 1210), (603, 1211), (603, 1212), (603, 1213), (603, 1214), (603, 1215), (603, 1216), (603, 1217), (603, 1218), (603, 1219), (603, 1220), (603, 1221), (603, 1222), (603, 1223), (603, 1224), (603, 1225), (603, 1226), (603, 1227), (603, 1228), (603, 1229), (603, 1230), (603, 1231), (603, 1232), (603, 1233), (603, 1234), (603, 1235), (603, 1236), (603, 1237), (603, 1238), (603, 1239), (603, 1240), (603, 1241), (603, 1242), (603, 1243), (603, 1244), (603, 1245), (603, 1246), (603, 1247), (603, 1248), (603, 1249), (603, 1250), (603, 1251), (603, 1252), (603, 1253), (603, 1254), (603, 1255), (603, 1256), (603, 1257), (603, 1258), (603, 1259), (603, 1260), (603, 1261), (603, 1262), (603, 1263), (603, 1264), (603, 1265), (603, 1266), (603, 1267), (603, 1268), (603, 1269), (603, 1270), (603, 1271), (603, 1272), (603, 1273), (603, 1274), (603, 1275), (603, 1276), (603, 1277), (603, 1278), (603, 1279), (603, 1280), (603, 1281), (603, 1282), (603, 1283), (603, 1284), (603, 1285), (603, 1286), (603, 1287), (603, 1288), (603, 1289), (603, 1290), (603, 1291), (603, 1292), (603, 1293), (603, 1294), (603, 1295), (603, 1296), (603, 1297), (603, 1298), (603, 1299), (603, 1300), (603, 1301), (603, 1302), (603, 1303), (603, 1304), (603, 1305), (603, 1306), (603, 1307), (603, 1308), (603, 1309), (603, 1310), (603, 1311), (603, 1312), (603, 1313), (603, 1314), (603, 1315), (603, 1316), (603, 1317), (603, 1318), (603, 1319), (603, 1320), (603, 1321), (603, 1322), (603, 1323), (603, 1324), (603, 1325), (603, 1326), (603, 1327), (603, 1328), (603, 1329), (603, 1330), (603, 1331), (603, 1332), (603, 1333), (603, 1334), (603, 1335), (603, 1336), (603, 1337), (603, 1338), (603, 1339), (603, 1340), (603, 1341), (603, 1342), (603, 1343), (603, 1344), (603, 1345), (603, 1346), (603, 1347), 
(603, 1348), (603, 1349), (603, 1350), (603, 1351), (603, 1352), (603, 1353), (603, 1354), (603, 1355), (603, 1356), (603, 1357), (603, 1358), (603, 1359), (603, 1360), (603, 1361), (603, 1362), (603, 1363), (603, 1364), (603, 1365), (603, 1366), (603, 1367), (603, 1368), (603, 1369), (603, 1370), (603, 1371), (603, 1372), (603, 1373), (603, 1374), (603, 1375), (603, 1376), (603, 1377), (603, 1378), (603, 1379), (603, 1380), (603, 1381), (603, 1382), (603, 1383), (603, 1384), (603, 1385), (603, 1386), (603, 1387), (603, 1388), (603, 1389), (603, 1390), (603, 1391), (603, 1392), (603, 1393), (603, 1394), (603, 1395), (603, 1396), (603, 1397), (603, 1398), (603, 1399), (603, 1400), (603, 1401), (603, 1402), (603, 1403), (603, 1404), (603, 1405), (603, 1406), (603, 1407), (603, 1408), (603, 1409), (603, 1410), (603, 1411), (603, 1412), (603, 1413), (603, 1414), (603, 1415), (603, 1416), (603, 1417), (603, 1418), (603, 1419), (603, 1420), (603, 1421), (603, 1422), (603, 1423), (603, 1424), (603, 1425), (603, 1426), (603, 1427), (603, 1428), (603, 1429), (603, 1430), (603, 1431), (603, 1432), (603, 1433), (603, 1434), (603, 1435), (603, 1436), (603, 1437), (603, 1438), (603, 1439), (603, 1440), (603, 1441), (603, 1442), (603, 1443), (603, 1444), (603, 1445), (603, 1446), (603, 1447), (603, 1448), (603, 1449), (603, 1450), (603, 1451), (603, 1452), (603, 1453), (603, 1454), (603, 1455), (603, 1456), (603, 1457), (603, 1458), (603, 1459), (603, 1460), (603, 1461), (603, 1462), (603, 1463), (603, 1464), (603, 1465), (603, 1466), (603, 1467), (603, 1468), (603, 1469), (603, 1470), (603, 1471), (603, 1472), (603, 1473), (603, 1474), (603, 1475), (603, 1476), (603, 1477), (603, 1478), (603, 1479), (603, 1480), (603, 1481), (603, 1482), (603, 1483), (603, 1484), (603, 1485), (603, 1486), (603, 1487), (603, 1488), (603, 1489), (603, 1490), (603, 1491), (603, 1492), (603, 1493), (603, 1494), (603, 1495), (603, 1496), (603, 1497), (603, 1498), (603, 1499), (603, 1500), (603, 
1501), (603, 1502), (603, 1503), (603, 1504), (603, 1505), (603, 1506), (603, 1507), (603, 1508), (603, 1509), (603, 1510), (603, 1511), (603, 1512), (603, 1513), (603, 1514), (603, 1515), (603, 1516), (603, 1517), (603, 1518), (603, 1519), (603, 1520), (603, 1521), (603, 1522), (603, 1523), (603, 1524), (710, 711), (710, 712), (710, 713), (710, 714), (710, 715), (710, 716), (710, 717), (710, 718), (710, 719), (710, 720), (710, 2329), (710, 2330), (710, 2331), (710, 2332), (710, 2333), (710, 2334), (710, 2335), (710, 2336), (710, 2337), (710, 2338), (710, 2339), (710, 2340), (710, 2341), (710, 2342), (710, 2343), (710, 2344), (710, 2345), (710, 2346), (710, 2347), (710, 2348), (710, 2349), (710, 2350), (710, 2351), (710, 2352), (710, 2353), (710, 2354), (710, 2355), (710, 2356), (710, 2357), (710, 2358), (710, 2359), (710, 2360), (710, 2361), (710, 2362), (710, 2363), (710, 2364), (710, 2365), (710, 2366), (710, 2367), (710, 2368), (710, 2369), (710, 2370), (710, 2371), (710, 2372), (710, 2373), (710, 2374), (710, 2375), (710, 2376), (710, 2377), (710, 2378), (710, 2379), (710, 2380), (710, 2381), (710, 2382), (710, 2383), (710, 2384), (710, 2385), (710, 2386), (710, 2387), (710, 2388), (710, 2389), (710, 2390), (710, 2391), (710, 2392), (710, 2393), (710, 2394), (710, 2395), (710, 2396), (710, 2397), (710, 2398), (710, 2399), (710, 2400), (710, 2401), (710, 2402), (710, 2403), (710, 2404), (710, 2405), (710, 2406), (710, 2407), (710, 2408), (710, 2409), (710, 2410), (710, 2411), (710, 2412), (710, 2413), (710, 2414), (710, 2415), (710, 2416), (710, 2417), (710, 2418), (710, 2419), (710, 2420), (710, 2421), (710, 2422), (710, 2423), (710, 2424), (710, 2425), (710, 2426), (710, 2427), (710, 2428), (710, 2429), (710, 2430), (710, 2431), (710, 2432), (710, 2433), (710, 2434), (710, 2435), (710, 2436), (710, 2437), (710, 2438), (710, 2439), (710, 2440), (710, 2441), (710, 2442), (710, 2443), (710, 2444), (710, 2445), (710, 2446), (710, 2447), (710, 2448), (710, 2449), 
(710, 2450), (710, 2451), (710, 2452), (710, 2453), (710, 2454), (710, 2455), (710, 2456), (710, 2457), (710, 2458), (710, 2459), (710, 2460), (710, 2461), (710, 2462), (710, 2463), (710, 2464), (710, 2465), (710, 2466), (710, 2467), (710, 2468), (710, 2469), (710, 2470), (710, 2471), (710, 2472), (710, 2473), (710, 2474), (710, 2475), (710, 2476), (710, 2477), (710, 2478), (710, 2479), (710, 2480), (710, 2481), (710, 2482), (710, 2483), (710, 2484), (710, 2485), (710, 2486), (710, 2487), (710, 2488), (710, 2489), (710, 2490), (710, 2491), (710, 2492), (710, 2493), (710, 2494), (710, 2495), (710, 2496), (710, 2497), (710, 2498), (710, 2499), (710, 2500), (710, 2501), (710, 2502), (710, 2503), (710, 2504), (710, 2505), (710, 2506), (710, 2507), (710, 2508), (710, 2509), (710, 2510), (710, 2511), (710, 2512), (710, 2513), (710, 2514), (710, 2515), (710, 2516), (710, 2517), (710, 2518), (710, 2519), (710, 2520), (710, 2521), (710, 2522), (710, 2523), (710, 2524), (710, 2525), (710, 2526), (710, 2527), (710, 2528), (710, 2529), (710, 2530), (710, 2531), (710, 2532), (710, 2533), (710, 2534), (710, 2535), (714, 711), (714, 716), (714, 719), (714, 720), (714, 721), (714, 722), (714, 2348), (714, 2351), (714, 2352), (714, 2354), (714, 2356), (714, 2366), (714, 2369), (714, 2370), (714, 2375), (714, 2386), (714, 2394), (714, 2395), (714, 2399), (714, 2402), (714, 2405), (714, 2407), (714, 2409), (714, 2431), (714, 2434), (714, 2444), (714, 2452), (714, 2461), (714, 2465), (714, 2469), (714, 2475), (714, 2482), (714, 2483), (714, 2484), (714, 2492), (714, 2509), (714, 2511), (714, 2518), (714, 2521), (714, 2523), (714, 2524), (714, 2526), (714, 2530), (714, 2594), (714, 2595), (714, 2596), (714, 2597), (714, 2598), (714, 2599), (714, 2600), (714, 2601), (714, 2602), (714, 2603), (714, 2604), (714, 2605), (714, 2606), (714, 2607), (714, 2608), (714, 2609), (714, 2610), (714, 2611), (714, 2612), (714, 2613), (714, 2614), (714, 2615), (714, 2616), (714, 2617), (714, 2618), 
(714, 2619), (714, 2620), (714, 2621), (714, 2622), (714, 2623), (714, 2624), (714, 2625), (714, 2626), (714, 2627), (714, 2628), (714, 2629), (714, 2630), (714, 2631), (714, 2632), (714, 2633), (714, 2634), (714, 2635), (714, 2636), (714, 2637), (714, 2638), (714, 2639), (714, 2640), (714, 2641), (714, 2642), (714, 2643), (714, 2644), (714, 2645), (714, 2646), (714, 2647), (714, 2648), (714, 2649), (714, 2650), (714, 2651), (714, 2652), (714, 2653), (714, 2654), (714, 2655), (714, 2656), (714, 2657), (714, 2658), (714, 2659), (714, 2660), (714, 2661), (714, 2662), (714, 2663), (714, 2664), (714, 2665), (714, 2666), (714, 2667), (714, 2668), (714, 2669), (714, 2670), (714, 2671), (714, 2672), (714, 2673), (714, 2674), (714, 2675), (714, 2676), (714, 2677), (714, 2678), (714, 2679), (714, 2680), (714, 2681), (714, 2682), (714, 2683), (714, 2684), (714, 2685), (714, 2686), (335, 2232), (2232, 2233), (2232, 2234), (2232, 2235), (2232, 2236), (2232, 2237), (2232, 2238), (2232, 2239), (2232, 2240), (2232, 2241), (2232, 2242), (2232, 2243), (2232, 2244), (2232, 2245), (2232, 2246), (2232, 2247), (2232, 2248), (2232, 2249), (2232, 2250), (2232, 2251), (2232, 2252), (2232, 2253), (2232, 2254), (2232, 2255), (2232, 2256), (2232, 2257), (2232, 2258), (2232, 2259), (2232, 2260), (2232, 2261), (2232, 2262), (2232, 2263), (2232, 2264), (2232, 2265), (2232, 2266), (2232, 2267), (2232, 2268), (2232, 2269), (2232, 2270), (2232, 2271), (2232, 2272), (2232, 2273), (2232, 2274), (2232, 2275), (2232, 2276), (2232, 2277), (2232, 2278), (2232, 2279), (2232, 2280), (2232, 2281), (2232, 2282), (2232, 2283), (2232, 2284), (2232, 2285), (2232, 2286), (2232, 2287), (2232, 2288), (2232, 2289), (2232, 2290), (2232, 2291), (2232, 2292), (2232, 2293), (2232, 2294), (2232, 2295), (2232, 2296), (2232, 2297), (2232, 2298), (2232, 2299), (2232, 2300), (2232, 2301), (2232, 2302), (2232, 2303), (2232, 2304), (2232, 2305), (2232, 2306), (2232, 2307), (2232, 2308), (2232, 2309), (2232, 2310), (2232, 
2311), (2232, 2312), (2232, 2313), (2232, 2314), (2232, 2315), (2232, 2316), (2232, 2317), (2232, 2318), (2232, 2319), (2232, 2320), (2232, 2321), (2232, 2322), (2232, 2323), (2232, 2324), (2232, 2325), (2232, 2326), (2232, 2327), (2232, 2328), (1524, 2699), (2594, 2536), (2699, 2687), (2699, 2698), (2699, 2709), (2699, 2714), (2699, 2720), (2699, 2730), (2699, 2746), (2699, 2748), (2699, 2754), (2699, 2773), (2699, 2775), (2699, 2777), (2699, 2801), (2699, 2804), (2699, 2805), (2699, 2806), (2699, 2811), (2699, 2824), (2699, 2826), (2699, 2829), (2699, 2831), (2699, 2841), (2699, 2857), (2699, 2858), (2699, 2859), (2699, 2860), (2699, 2861), (2699, 2862), (2699, 2863), (2699, 2864), (2699, 2865), (2699, 2866), (2699, 2867), (2699, 2868), (2699, 2869), (2699, 2870), (2699, 2871), (2699, 2872), (2699, 2873), (2699, 2874), (2699, 2875), (2699, 2876), (2699, 2877), (2699, 2878), (2699, 2879), (2699, 2880), (2699, 2881), (2699, 2882), (2699, 2883), (2699, 2884), (2699, 2885), (2699, 2886), (2699, 2887), (2699, 2888), (2536, 2537), (2536, 2538), (2536, 2539), (2536, 2540), (2536, 2541), (2536, 2542), (2536, 2543), (2536, 2544), (2536, 2545), (2536, 2546), (2536, 2547), (2536, 2548), (2536, 2549), (2536, 2550), (2536, 2551), (2536, 2552), (2536, 2553), (2536, 2554), (2536, 2555), (2536, 2556), (2536, 2557), (2536, 2558), (2536, 2559), (2536, 2560), (2536, 2561), (2536, 2562), (2536, 2563), (2536, 2564), (2536, 2565), (2536, 2566), (2536, 2567), (2536, 2568), (2536, 2569), (2536, 2570), (2536, 2571), (2536, 2572), (2536, 2573), (2536, 2574), (2536, 2575), (2536, 2576), (2536, 2577), (2536, 2578), (2536, 2579), (2536, 2580), (2536, 2581), (2536, 2582), (2536, 2583), (2536, 2584), (2536, 2585), (2536, 2586), (2536, 2587), (2536, 2588), (2536, 2589), (2536, 2590), (2536, 2591), (2536, 2592), (2536, 2593), (2687, 2688), (2687, 2689), (2687, 2690), (2687, 2691), (2687, 2692), (2687, 2693), (2687, 2694), (2687, 2695), (2687, 2696), (2687, 2697), (2687, 2698), (2687, 2700), 
(2687, 2701), (2687, 2702), (2687, 2703), (2687, 2704), (2687, 2705), (2687, 2706), (2687, 2707), (2687, 2708), (2687, 2709), (2687, 2710), (2687, 2711), (2687, 2712), (2687, 2713), (2687, 2714), (2687, 2715), (2687, 2716), (2687, 2717), (2687, 2718), (2687, 2719), (2687, 2720), (2687, 2721), (2687, 2722), (2687, 2723), (2687, 2724), (2687, 2725), (2687, 2726), (2687, 2727), (2687, 2728), (2687, 2729), (2687, 2730), (2687, 2731), (2687, 2732), (2687, 2733), (2687, 2734), (2687, 2735), (2687, 2736), (2687, 2737), (2687, 2738), (2687, 2739), (2687, 2740), (2687, 2741), (2687, 2742), (2687, 2743), (2687, 2744), (2687, 2745), (2687, 2746), (2687, 2747), (2687, 2748), (2687, 2749), (2687, 2750), (2687, 2751), (2687, 2752), (2687, 2753), (2687, 2754), (2687, 2755), (2687, 2756), (2687, 2757), (2687, 2758), (2687, 2759), (2687, 2760), (2687, 2761), (2687, 2762), (2687, 2763), (2687, 2764), (2687, 2765), (2687, 2766), (2687, 2767), (2687, 2768), (2687, 2769), (2687, 2770), (2687, 2771), (2687, 2772), (2687, 2773), (2687, 2774), (2687, 2775), (2687, 2776), (2687, 2777), (2687, 2778), (2687, 2779), (2687, 2780), (2687, 2781), (2687, 2782), (2687, 2783), (2687, 2784), (2687, 2785), (2687, 2786), (2687, 2787), (2687, 2788), (2687, 2789), (2687, 2790), (2687, 2791), (2687, 2792), (2687, 2793), (2687, 2794), (2687, 2795), (2687, 2796), (2687, 2797), (2687, 2798), (2687, 2799), (2687, 2800), (2687, 2801), (2687, 2802), (2687, 2803), (2687, 2804), (2687, 2805), (2687, 2806), (2687, 2807), (2687, 2808), (2687, 2809), (2687, 2810), (2687, 2811), (2687, 2812), (2687, 2813), (2687, 2814), (2687, 2815), (2687, 2816), (2687, 2817), (2687, 2818), (2687, 2819), (2687, 2820), (2687, 2821), (2687, 2822), (2687, 2823), (2687, 2824), (2687, 2825), (2687, 2826), (2687, 2827), (2687, 2828), (2687, 2829), (2687, 2830), (2687, 2831), (2687, 2832), (2687, 2833), (2687, 2834), (2687, 2835), (2687, 2836), (2687, 2837), (2687, 2838), (2687, 2839), (2687, 2840), (2687, 2841), (2687, 2842), (2687, 
2843), (2687, 2844), (2687, 2845), (2687, 2846), (2687, 2847), (2687, 2848), (2687, 2849), (2687, 2850), (2687, 2851), (2687, 2852), (2687, 2853), (2687, 2854), (2687, 2855), (2687, 2856), (2687, 2857)])

Edges are always shown as a pair of nodes, i.e., each edge represents a connection between two nodes. Nodes are always single values, as shown above.


Visualisation¶


In [68]:
# Use `nx.draw_networkx` to visualise graphs
plt.figure(figsize=(10,8))
nx.draw_networkx(net_graph)

The above is a representation of the entire graph, with all the nodes and edges present. The nodes are represented in blue and the dark marks are the edges. We had already seen that some nodes were going to have multiple edges and the above plot confirms that assumption.

In [69]:
# Visualise just the nodes
nx.draw_networkx_nodes(net_graph, pos=nx.spring_layout(net_graph))
Out[69]:
<matplotlib.collections.PathCollection at 0x7f986067aaf0>
In [70]:
# Visualise just the edges
nx.draw_networkx_edges(net_graph, pos=nx.kamada_kawai_layout(net_graph))
Out[70]:
<matplotlib.collections.LineCollection at 0x7f985988ea60>

For other, specialised methods of drawing a network graph, we can use the below layouts to get a better understanding of how the graph behaves.

In [71]:
G=net_graph
layout = nx.spring_layout(G)
plt.title('Spring Layout of Social Network Graph')
nx.draw(G,pos=layout,node_size=150,alpha=0.5)
In [72]:
G=net_graph
layout = nx.spectral_layout(G)
plt.title('Spectral Layout of Social Network Graph')
nx.draw(G,pos=layout,node_size=150,alpha=0.5)

After experimenting with several layouts, we found the spring and spectral layouts above to be the most readable for our dataset.

We attempted to draw a planar layout, but our graph is not planar, so an error was raised.

The Kamada-Kawai layout was too cluttered and difficult to interpret, which is why it was skipped.


Statistical Analysis¶


Properties¶

Property Description
Degree The degree of a node is the number of edges connected to it.

nx.degree(g,n) will return the degree of node n.
Connected components The separate components that make up the network. A component is a set of nodes from which it is possible to reach all other nodes in the set.

nx.number_connected_components(g) returns the number of connected components in the graph.
Diameter The length of the longest shortest path in the network, i.e., the maximum number of edges that must be traversed on the shortest path between any two nodes.

nx.diameter(g) will return the diameter of graph g.
Density The density of a graph is a measure of how many edges it has relative to the total number of possible edges it could have.

nx.density(g) will return the density of graph g.
Shortest path The shortest route along edges in the graph from one node to another. If edges are weighted, the edge weight can be counted as the length of the edge.

nx.shortest_path(g, n1, n2) will return the shortest path between nodes n1 and n2. nx.shortest_path(g) returns the shortest paths between every pair of nodes.
In [73]:
nx.degree(net_graph, 1525)
Out[73]:
710
In [74]:
nx.degree(net_graph, 603)
Out[74]:
769
In [75]:
nx.degree(net_graph, 288)
Out[75]:
481
In [76]:
nx.number_connected_components(net_graph)
Out[76]:
1
In [77]:
nx.diameter(net_graph)
Out[77]:
9
In [78]:
nx.density(net_graph)
Out[78]:
0.0007150690793671507
In [79]:
nx.shortest_path(net_graph, 1, 2720)
Out[79]:
[1, 69, 603, 1524, 2699, 2720]
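To make the definitions in the table concrete, the sketch below recomputes degree, density, diameter and a shortest path by hand on a small hypothetical graph. The adjacency dictionary `toy_adj` and the helper `bfs_paths` are assumptions made for illustration and are not part of the dataset; only the standard library is used, so the results can be compared with the corresponding `nx.*` calls.

```python
from collections import deque

# Hypothetical toy graph: a triangle (1, 2, 3) with a tail 3-4-5.
toy_adj = {
    1: {2, 3},
    2: {1, 3},
    3: {1, 2, 4},
    4: {3, 5},
    5: {4},
}

def bfs_paths(adj, source):
    """Return one shortest path from `source` to every reachable node."""
    paths = {source: [source]}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in sorted(adj[v]):
            if w not in paths:
                paths[w] = paths[v] + [w]
                queue.append(w)
    return paths

n = len(toy_adj)
m = sum(len(nbrs) for nbrs in toy_adj.values()) // 2  # each edge is seen twice

degree = {v: len(nbrs) for v, nbrs in toy_adj.items()}
density = 2 * m / (n * (n - 1))  # edges present / edges possible

# Diameter: the longest of all shortest paths, measured in edges.
diameter = max(
    len(path) - 1
    for v in toy_adj
    for path in bfs_paths(toy_adj, v).values()
)
path_1_to_5 = bfs_paths(toy_adj, 1)[5]

print(degree[3], density, diameter, path_1_to_5)
```

On this toy graph, node 3 has degree 3, half of the possible edges are present, and the longest shortest path (from node 1 or 2 to node 5) has three edges.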

Statistics¶


Degree Centrality¶

Degree centrality measures the importance of a node solely by the number of edges attached to it: the more connections, the higher the degree centrality. NetworkX normalises the value by dividing the degree by n - 1, the maximum possible degree in a graph of n nodes.

In [80]:
# Source: https://networkx.org/nx-guides/content/exploratory_notebooks/facebook_notebook.html#centrality-measures

degree_centrality = nx.centrality.degree_centrality(net_graph)
(sorted(degree_centrality.items(), key=lambda item: item[1], reverse=True))[:10]
Out[80]:
[(603, 0.2663664703844822),
 (1525, 0.24593003117422932),
 (288, 0.16660893661240042),
 (1, 0.09941115344648424),
 (710, 0.07655005195704884),
 (2687, 0.05888465535157603),
 (714, 0.04814686525805335),
 (2232, 0.03359889158295809),
 (2536, 0.020090058884655353),
 (2699, 0.019050917907862834)]

In the above code, we compute the degree centrality of every node in the graph and display the ten highest values.

In [81]:
# Source: https://networkx.org/nx-guides/content/exploratory_notebooks/facebook_notebook.html#centrality-measures
(sorted(net_graph.degree, key=lambda item: item[1], reverse=True))[:10]
Out[81]:
[(603, 769),
 (1525, 710),
 (288, 481),
 (1, 287),
 (710, 221),
 (2687, 170),
 (714, 139),
 (2232, 97),
 (2536, 58),
 (2699, 55)]

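Here, we sort the nodes by raw degree (number of neighbours) and display the top ten. The ranking matches the degree centrality ranking above, since degree centrality is the degree rescaled by a constant.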
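The two outputs above can be cross-checked with a little arithmetic. The sketch below assumes NetworkX's normalisation (degree centrality = degree / (n - 1)) and uses the values printed above to recover the implied number of nodes; it then uses the density reported earlier to estimate the number of edges. The variable names are illustrative, and the derived figures are implied by the outputs rather than read directly from the graph.

```python
# (degree, degree centrality) pairs copied from the two outputs above.
top_nodes = {
    603: (769, 0.2663664703844822),
    1525: (710, 0.24593003117422932),
    288: (481, 0.16660893661240042),
}

# Assumption: centrality = degree / (n - 1), so degree / centrality = n - 1.
implied = {node: deg / cent for node, (deg, cent) in top_nodes.items()}
n_minus_1 = round(implied[603])
n_nodes = n_minus_1 + 1

# Density = 2m / (n (n - 1)), so the reported density implies an edge count m.
density = 0.0007150690793671507  # value returned by nx.density(net_graph)
implied_edges = round(density * n_nodes * n_minus_1 / 2)

print(implied)
print(n_nodes, implied_edges)
```

All three ratios agree, which suggests a graph of 2888 nodes and roughly 2981 edges. That is only slightly more edges than a tree on the same nodes would have, which is consistent with the very low density.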

In [82]:
plt.figure(figsize=(8, 5))
plt.hist(degree_centrality.values(), bins=25)
plt.xticks(ticks=[0, 0.05, 0.1, 0.15, 0.2, 0.25])
plt.title("Degree Centrality Histogram ", fontdict={"size": 15}, loc="center")
plt.xlabel("Degree Centrality", fontdict={"size": 10})
plt.ylabel("Counts", fontdict={"size": 10})
Out[82]:
Text(0, 0.5, 'Counts')
In [83]:
pos = nx.spring_layout(net_graph)
node_size = [v * 1000 for v in degree_centrality.values()] 
plt.figure(figsize=(15, 8))
nx.draw_networkx(net_graph, pos=pos, node_size=node_size, with_labels=False, width=0.15)
plt.axis("off")
Out[83]:
(-0.6573494791984558,
 1.060746443271637,
 -1.0745257794857026,
 0.5650426805019378)

The above two plots give a visual representation of how the connectivity of the graph is concentrated in just a few important nodes.

In the case of our social network graph, degree centrality tells us how many connections (internet friends) a particular user has relative to the size of the network. Node 603 has the highest degree centrality and therefore the most internet friends: a value of about 0.27 means this user is directly connected to roughly 27% of the other users in the network.

Degree Distribution¶

The degree distribution gives, for each degree value, the number (or fraction) of nodes in the network that have that degree.

In [84]:
# Use the function `.degree_histogram()`. 
# We skip the nodes that have a degree 0 
ddist = nx.degree_histogram(net_graph)[1:]
plt.loglog(range(1,len(ddist)+1),ddist,'o')

plt.title('Degree Distribution')
plt.xlabel('Degree')
plt.ylabel('Frequency')
Out[84]:
Text(0, 0.5, 'Frequency')

From the above plot, we can determine that most nodes have a degree of 1, and that the count of nodes falls as the degree increases before the distribution flattens out.

From our social network graph, we can conclude that most of the nodes (users) have very few friends, while only a handful have many connections.
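As a minimal sketch of what a degree histogram contains, the snippet below builds one by hand for a hypothetical edge list (`toy_edges` is an assumption for illustration). The resulting list follows the same convention as `nx.degree_histogram`: the entry at index k is the number of nodes with degree k.

```python
from collections import Counter

# Hypothetical toy network: one hub with four neighbours, plus one extra tie.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")]

# Count how many edges touch each node.
degree = Counter()
for u, v in toy_edges:
    degree[u] += 1
    degree[v] += 1

# histogram[k] = number of nodes whose degree is exactly k.
counts = Counter(degree.values())
histogram = [counts.get(k, 0) for k in range(max(degree.values()) + 1)]

# Drop index 0 (isolated nodes), as done before plotting on log-log axes.
ddist = histogram[1:]

print(dict(degree))
print(histogram, ddist)
```

Even in this tiny example the shape is the same as in our network: several low-degree nodes and a single high-degree hub.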


Clustering Coefficient¶

Clustering Coefficient looks at how interconnected the neighbours of a node in a graph are.

The local clustering coefficient of a node is defined as: $$ C = \frac{2E_N}{k(k-1)} $$

Here, $E_N$ is the total number of edges between neighbours of the node, and $k$ is the number of neighbours.
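The formula can be checked by hand on a small example. The sketch below is a plain-Python illustration on a hypothetical graph (`toy_adj` and `local_clustering` are names assumed for this example); it returns 0 for nodes with fewer than two neighbours, which is also the convention `nx.clustering` follows.

```python
from itertools import combinations

# Hypothetical toy graph: triangle a-b-c, plus a pendant node d attached to a.
toy_adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

def local_clustering(adj, node):
    """C = 2 * E_N / (k * (k - 1)) for one node of an undirected graph."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # no pair of neighbours exists, so no triangle is possible
    # E_N: number of edges that join two neighbours of `node`.
    e_n = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * e_n / (k * (k - 1))

coefficients = {node: local_clustering(toy_adj, node) for node in toy_adj}
print(coefficients)
```

Node `a` has k = 3 neighbours with only one edge among them (b-c), giving C = 2 / 6; nodes `b` and `c` each have two neighbours that are connected, giving C = 1; the pendant node `d` gets 0.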

In [85]:
ccg = nx.clustering(net_graph)
plt.hist(list(ccg.values()),bins='auto')
plt.title('Clustering Coefficient')
plt.xlabel("Clustering Coefficient")
plt.ylabel("Frequency")
Out[85]:
Text(0, 0.5, 'Frequency')

We know that most nodes are connected to a single node and hence have only one neighbour. A node with fewer than two neighbours cannot form a triangle, and NetworkX assigns it a clustering coefficient of 0, which is why most of the values here are 0.

In the case of our social network graph, a low clustering coefficient corresponds to a network wherein the connections don't form a tight-knit community.


Betweenness Centrality¶

Betweenness centrality measures how important a node is by the fraction of shortest paths between other pairs of nodes that pass through it.

Nodes that have high betweenness centrality are ones that usually connect different regions of the network.
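The following sketch illustrates this on a hypothetical graph in which a low-degree node bridges two hubs. It is a plain-Python version of Brandes' algorithm for unweighted, undirected graphs, normalised by (n - 1)(n - 2) so that the values are comparable with the NetworkX default; the graph `toy` and the helper names are assumptions for illustration, not the notebook's data.

```python
from collections import deque

def betweenness(adj):
    """Normalised betweenness centrality (Brandes) for an undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    # Every unordered pair was counted from both endpoints, so dividing by
    # the number of ordered pairs (n - 1)(n - 2) gives a value in [0, 1].
    return {v: b / ((n - 1) * (n - 2)) for v, b in bc.items()}

def add_edge(adj, u, v):
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

# Hypothetical graph: hubs A and B with three leaves each, joined via bridge X.
toy = {}
for leaf in ("a1", "a2", "a3"):
    add_edge(toy, "A", leaf)
for leaf in ("b1", "b2", "b3"):
    add_edge(toy, "B", leaf)
add_edge(toy, "A", "X")
add_edge(toy, "X", "B")

bc = betweenness(toy)
print(len(toy["X"]), round(bc["X"], 3), round(bc["A"], 3))
```

Node `X` has only two neighbours, yet every shortest path between the two halves of the graph passes through it, so its betweenness centrality is close to that of the hubs.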

In [86]:
betweenness_centrality = nx.betweenness_centrality(net_graph)
(sorted(betweenness_centrality.items(), key=lambda item: item[1], reverse=True))[:10]
Out[86]:
[(603, 0.5497065448918781),
 (288, 0.46612992918844975),
 (1525, 0.4294450041419194),
 (247, 0.24124220674273653),
 (1, 0.1860965105682874),
 (2699, 0.13099957488596214),
 (1524, 0.13019147414713747),
 (710, 0.12724689931998354),
 (714, 0.11276804568283787),
 (2687, 0.09928765193746143)]

The above code sorts the betweenness centrality of the network to display the most important nodes in the network.

We construct a plot and a graph below to show how disproportionate the important nodes are.

In [87]:
# Source: https://networkx.org/nx-guides/content/exploratory_notebooks/facebook_notebook.html#centrality-measures

plt.figure(figsize=(15, 8))
plt.hist(betweenness_centrality.values(), bins=100)
plt.xticks(ticks=[0, 0.062, 0.1, 0.128, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7])
plt.title("Betweenness Centrality Histogram ", fontdict={"size": 35}, loc="center")
plt.xlabel("Betweenness Centrality", fontdict={"size": 20})
plt.ylabel("Counts", fontdict={"size": 20})
Out[87]:
Text(0, 0.5, 'Counts')
In [88]:
# Source: https://networkx.org/nx-guides/content/exploratory_notebooks/facebook_notebook.html#centrality-measures

pos = nx.spring_layout(net_graph)
node_size = [v * 1000 for v in betweenness_centrality.values()]
plt.figure(figsize=(15, 8))
nx.draw_networkx(net_graph, pos=pos, node_size=node_size, with_labels=False, width=0.15)
plt.axis("off")
Out[88]:
(-1.0439851224422454,
 0.728183263540268,
 -1.0939665377140044,
 0.9732972919940949)

In a social network, betweenness centrality gauges how well a user connects different groups. A user with high betweenness centrality links many groups and is well placed to facilitate communication and information flow between them.

A user with low betweenness centrality has fewer connections to other groups and is less effective as a go-between. This is the case for most nodes in our social network, whereas nodes 603, 288 and 1525 have the three highest values of betweenness centrality.

Comparing with the previous section, node 288 has a degree of 481, well below node 1525 (710), so by degree alone it looks less important. Yet it ranks above 1525 in betweenness centrality, because more shortest paths pass through it. Similarly, node 247 ranks fourth by betweenness centrality but does not appear in the top ten by degree. Therefore, degree is not the only way to gauge the importance of a node.


Assortativity¶

Assortativity refers to the tendency of nodes in a network to connect to other nodes that are similar to them in some attribute or characteristic.

In a social network, this could mean users connecting to other users with similar demographics. The coefficient computed below is the degree assortativity, where similarity is measured by the number of connections.

Assortativity can be positive or negative. Positive (assortative) mixing means nodes tend to connect to nodes similar to themselves; for degree assortativity, these are nodes with a similar number of connections. Negative (disassortative) mixing means nodes tend to connect to dissimilar nodes, e.g., high-degree nodes linking to low-degree nodes.

In [89]:
nx.degree_assortativity_coefficient(net_graph)
Out[89]:
-0.6682140067239861

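The coefficient of about -0.67 indicates strongly negative assortativity: high-degree nodes mostly connect to low-degree nodes. This agrees with the degree distribution, where a few hubs are linked to many users that have a single connection. In such a network, information or ideas may spread efficiently through the hubs.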
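Degree assortativity is the Pearson correlation between the degrees found at the two ends of each edge. The sketch below computes it by hand for two hypothetical graphs (the function and edge lists are assumptions for illustration): a star, which is the extreme hub-and-spoke case, and a four-node path.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at either end of every edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count each undirected edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    mean = sum(xs) / len(xs)  # xs and ys hold the same values, so one mean
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    var = sum((x - mean) ** 2 for x in xs)  # also the variance of ys
    return cov / var

# Hypothetical star: one hub joined to four leaves.
star_edges = [("hub", leaf) for leaf in ("a", "b", "c", "d")]
# Hypothetical path: 0 - 1 - 2 - 3.
path_edges = [(0, 1), (1, 2), (2, 3)]

r_star = degree_assortativity(star_edges)
r_path = degree_assortativity(path_edges)
print(r_star, round(r_path, 3))
```

The star gives -1, since every edge joins the highest-degree node to a degree-1 node, and the path gives -0.5. Our network's value lies between these two, closer to the hub-and-spoke extreme. Note that the function is undefined (division by zero) when all nodes have the same degree.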


Resources¶

  • Pandas plotting

    • Chart visualisation: https://pandas.pydata.org/docs/user_guide/visualization.html
  • Matplotlib

    • Pyplot tutorial: https://matplotlib.org/stable/tutorials/introductory/pyplot.html
    • Colormaps: https://matplotlib.org/stable/gallery/color/colormap_reference.html
  • Seaborn

    • Tutorial: https://seaborn.pydata.org/tutorial.html
  • Plotly

    • Tutorial: https://plotly.com/python/sunburst-charts/
  • Scikit Learn

    • Tutorial: https://scikit-learn.org/stable/user_guide.html
  • Scipy

    • Tutorial: https://docs.scipy.org/doc/scipy/
  • Networkx

    • https://networkx.org/documentation/stable/tutorial.html
    • https://networkx.org/nx-guides/content/exploratory_notebooks/facebook_notebook.html